Abstract. We use the recently developed theory of forest algebras to find algebraic
characterizations of the languages of unranked trees and forests definable in various logics.
These include the temporal logics CTL and EF, and first-order logic over the ancestor 
relation. While the characterizations are in general non-effective, we are able to use them 
to formulate necessary conditions for definability and provide new proofs that a number 
of languages are not definable in these logics. 



1. Introduction

Logics for specifying properties of labeled trees play an important role in several areas of
Computer Science. We say that a class $\mathcal{L}$ of regular languages of trees has an effective
characterization if there is an algorithm which decides if a given regular language of trees
belongs to $\mathcal{L}$. Effective characterizations are known only for a few logics. In particular, we
do not know if such characterizations exist for the classes of languages defined by the most
common logics, such as CTL, CTL*, PDL, or first-order logic with the ancestor relation.

In this paper we consider logics for unranked trees, in which there is no a priori bound 
on the number of children a node may have. Many such logics, including all the logics that 
are considered in this paper, are no more expressive than monadic second-order logic, and 
thus the properties they define can be described using automata. Barcelo and Libkin [1]
and Libkin [15] catalogue a number of such logics and contrast their expressive power. We
use the recently developed theory of forest algebras to find algebraic characterizations of the
languages of unranked trees definable in several of the most common logics. While the
characterizations are in general non-effective, we are able to use them to formulate necessary
conditions for definability and provide new proofs that a number of languages are not
definable in these logics.

For properties of words, such questions have been fruitfully studied by algebraic means. 
Whether or not a regular word language L can be defined in a given logic can often be 
determined by verifying some property of the syntactic monoid of L — the transition monoid 
of the minimal automaton of L. The earliest work in this direction is due to McNaughton 
and Papert [20] who studied first-order logic with linear order, and showed that a language 
is definable in this logic if and only if its syntactic monoid is aperiodic — that is, contains no 
nontrivial groups. A comprehensive survey treating many different predicate logics is given
in Straubing [23]; temporal logics are studied by Cohen, Perrin and Pin [8] and Wilke [24],
among others.

Algebraic techniques provide a striking alternative to purely model-theoretic methods 
for studying the expressive power of logics over words. In many cases they have led to 
effective characterizations of certain logics, and actually to reasonably efficient algorithms. 
Even in the absence of effective characterizations, it is frequently possible to obtain effective 
necessary conditions for expressibility in a logic and use these to show the non-expressibility 
of certain languages. For instance, the strictness of the $\Sigma_k$-hierarchy in first-order logic on
words (the dot-depth hierarchy) was first proved by such algebraic means (Brzozowski and
Knast, Straubing [22]), while effective characterization of the levels of the hierarchy
remains an open problem.

There have been a number of efforts to extend this algebraic theory to trees; a notable
recent instance is in the work of Esik and Weil on preclones [9, 10]. Recently, Bojańczyk and
Walukiewicz [3] introduced forest algebras, and along with them the syntactic forest algebra,
which generalize monoids and the syntactic monoid for languages of forests of unranked
trees. This algebraic model is rather simple, and in contrast to others studied in the
literature, has already yielded effective criteria for definability in a number of logics: see
Bojańczyk [4], Bojańczyk-Segoufin-Straubing [6], Bojańczyk-Segoufin [5]. Forest algebras
are also implicit in the work of Benedikt and Segoufin [2] on first-order logic with successor
and of Place and Segoufin [18] on locally testable tree languages.

In the present paper we continue the study of forest algebras, by developing a theory of 
composition of forest algebras, using the wreath product. The wreath product of transforma- 
tion monoids plays an important role in the theory for words. In particular, it is connected 
to a composition operation on languages and to generalized temporal operators. This paper 
is concerned with describing the connection between formula composition and the wreath 
product of forest algebras, in the case of unranked trees. Here is a brief summary of our 
results: 

(1) To each logic $\mathcal{L}$ among EF, CTL, CTL*, first-order logic with ancestor, PDL and
graded PDL, we associate a class of forest algebras, called the base of $\mathcal{L}$. We show
that a language of forests is definable in the logic $\mathcal{L}$ if and only if it is recognized by
an iterated wreath product of the forest algebras from the base of $\mathcal{L}$ (Theorem 5.2).

(2) In the cases of EF and CTL, the base has a single forest algebra. For the other cases
we show that there is no finite base. As a consequence, none of these logics can be
generated by a finite collection of generalized temporal operators. Using our algebraic
framework, we give a simple and general proof of this fact (Theorem 5.5).

(3) For the logics that do not have a finite base, we give an effective characterization of the
base (Theorems 5.3 and 5.4). Note that an effective characterization of a base does
not imply an effective characterization for wreath products of the base, so this result
does not give an effective characterization of any of the logics mentioned in item (1).

(4) Going one step further, we provide an effective characterization for the path languages
(Theorem 5.4): boolean combinations of languages from the base of graded PDL.

(5) We give a new proof, based on the wreath product, of an effective characterization
of the logic EF. This result was proved earlier by other means (Bojańczyk and
Walukiewicz [3]). Our argument here computes a decomposition based on the ideal
structure of the underlying forest algebra.

(6) Although we do not find effective characterizations for other prominent logics from our 
list, we are able to use our framework to establish necessary conditions for definability 
in these logics, and consequently to prove that a number of specific languages are not 
definable in them (Theorem 8.2).

(7) We give an effective characterization of CTL* languages within first-order definable
languages, and similarly of PDL languages within the languages definable in graded PDL
(Theorem 9.2).

Plan of the paper. In Sections 2-4 we present the basic terminology concerning, respectively, 
trees, logic, and forest algebras. Our treatment of temporal logics is somewhat unorthodox:
our algebraic theory requires us to interpret formulas in forests as well as in trees,
and the precise syntax and semantics differ between the two cases. Section 4 includes
a detailed treatment of the wreath product of forest algebras. In Section 5 we establish 
the first of our main results, giving wreath product characterizations of all the logics under 
consideration. In Section 6 we give the effective characterization of EF, and in Section 7 the 
necessary conditions for definability in the other logics. Section 8 is devoted to applications 
of these conditions. 

We note that Esik and Ivan [12] have done work of a similar flavor for CTL (for
trees of bounded rank). Our work here is of considerably larger scope, both in the number
of different logics considered and in the concrete consequences our algebraic theory permits
us to deduce.

The present article is the complete version of an extended abstract presented at the 
2009 IEEE Symposium on Logic in Computer Science. 



2. Trees, Forests and Contexts

Let A be a finite alphabet. Formally, forests and trees over A are expressions generated by
the following rules: (i) if s is a forest and $a \in A$, then as is a tree; (ii) if $(t_1, \ldots, t_k)$ is a
finite sequence of trees, then $t_1 + \cdots + t_k$ is a forest. We permit this summation to take
place over an empty sequence, yielding the empty forest, which we denote by 0, and which
gets the recursion started. So, for example, the following forest with two roots

[Figure: a forest with two roots]

is described by the expression

$$a(a0 + b(c0 + b0 + c0)) + b(a0 + b0).$$

Normally, when we write such expressions, we delete the zeros. We denote the set of forests
over A by $H_A$. This set forms a monoid with respect to forest concatenation s + t, with the
empty forest being the identity. We denote the set of trees over A by $T_A$.

If x is a node in a forest, then the subtree of x is simply the tree rooted at x, and the
subforest of x is the forest consisting of all subtrees of the children of x. In other words, if
the subtree of x is as, with $a \in A$ and $s \in H_A$, then the subforest of x is s. Note that the
subforest of x does not include the node x itself, and is empty if x is a leaf.

A forest language over A is any subset of $H_A$.

A context p over A is formed by replacing a leaf of a nonempty forest by a special
symbol □. Think of □ as a kind of place-holder, or hole. Given a context p and a forest s,
we form a forest ps by substituting s for the hole in p. In the interpretation of forests as
expressions, this really is just substitution of the expression s for the hole of p; the graphical
interpretation of this operation is depicted below.

[Figure: substituting a forest s for the hole of a context p]




In a similar manner, we can substitute another context q for the hole, and obtain a
new context pq. We obtain in this way a composition operation on contexts. We denote the
set of contexts over A by $V_A$. This set forms a monoid with respect to this composition
operation, with the empty context □ as the identity.

Note that for any $s, t \in H_A$, $V_A$ contains a context s + □ + t, in which the hole has no
parent, such that (s + □ + t)u = s + u + t for all $u \in H_A$.
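Substitution into the hole can be sketched as follows. We assume the nested-tuple encoding of forests (a forest is a tuple of (label, forest) pairs) and represent the hole by a marker string `HOLE`; both the encoding and the function name `plug` are hypothetical choices made for this illustration.

```python
HOLE = "□"  # assumed marker for the hole; it must not be used as a letter

def plug(p, s):
    """Substitute the forest (or context) s for the hole of the context p."""
    out = []
    for node in p:
        if node == HOLE:
            out.extend(s)                  # splice s in where the hole was
        else:
            label, sub = node
            out.append((label, plug(sub, s)))
    return tuple(out)

a, b, c = ("a", ()), ("b", ()), ("c", ())
s, t, u = (a,), (b,), (c,)

p = (("a", (HOLE,)),)   # the context a□
q = (("b", (HOLE,)),)   # the context b□

print(plug(s + (HOLE,) + t, u))  # the forest s + u + t
print(plug(plug(p, q), u))       # (pq)u, the same forest as p(qu)
```

Because a context is just a forest with one hole, the same function serves both for the action ps of contexts on forests and for the composition pq of contexts.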

Our trees, forests and contexts are ordered, so that s + t is a different forest from
t + s unless s = t or one of s, t is 0. This noncommutativity is important in a number of
applications. However, the present article really deals with unordered trees, so there is no
harm in thinking of + as a commutative operation on forests.

3. Logics for Forest Languages 

We can define regular forest languages by means of an automaton model that is a minor 
modification of the standard bottom-up tree automaton. The transition function has to 
be altered to cope with unbounded branching, and the acceptance condition needs to take 
account of the sequence of states in the roots of all the trees in the forest. See [3] for a 
precise definition of such an automaton model. The usual equivalence between monadic 
second-order logic and regularity holds in this setting. 

For a general treatment of predicate and temporal logics for unranked trees, we refer 
the reader to Libkin [15] and Barcelo-Libkin [1]. We will have to give a somewhat different
description of similar logics in order to express properties of forests as well as of trees. In 
all cases the logics that we describe are fragments of monadic second-order logic, and thus 
the languages they define are all regular forest languages.

3.1. First-order logic for trees and forests. Let A be a finite alphabet. Consider first-order
logic equipped with unary predicates $Q_a$ for each $a \in A$, and a single binary predicate
$\prec$. Variables are interpreted as nodes in forests over A. The formula $Q_a x$ is interpreted to mean
that node x is labeled a, and $x \prec y$ to mean that node x is a (non-strict) ancestor of node
y. A sentence $\varphi$ (that is, a formula without free variables) consequently defines a language
$L_\varphi \subseteq H_A$ consisting of the forests over A that satisfy $\varphi$. For example, the sentence

$$\exists x \exists y\, (Q_a x \wedge Q_a y \wedge \neg(x \prec y) \wedge \neg(y \prec x))$$

defines the set of forests containing two incomparable occurrences of a. We denote this
logic by FO[$\prec$]. Note that this logic has no predicates to access the order of siblings. In
particular, any language defined by the logic will be horizontally commutative, i.e. closed
under reordering sibling trees.

It is more traditional to consider logics over trees rather than over forests. For FO[$\prec$]
we need not worry too much about this distinction, since we can express in first-order logic
the property that a forest has exactly one root (by the sentence $\exists x \forall y\, (x \prec y)$). Thus the
question of whether a given set of trees is first-order definable does not depend on whether
we choose to interpret sentences in trees or in forests.
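The two example sentences can be checked by brute force on a concrete forest. The sketch below is only an illustration under our own assumptions: forests are nested tuples of (label, forest) pairs, a node is identified by its tuple of child indices, and the function names are hypothetical.

```python
from itertools import combinations

def nodes(f, prefix=()):
    """Yield (position, label) for every node; a position is a tuple of child indices."""
    for i, (label, sub) in enumerate(f):
        pos = prefix + (i,)
        yield pos, label
        yield from nodes(sub, pos)

def is_ancestor(x, y):
    """Non-strict ancestor relation: position x is a prefix of position y."""
    return y[:len(x)] == x

def two_incomparable(f, a):
    """Exists x, y labeled a such that neither is an ancestor of the other."""
    a_nodes = [pos for pos, label in nodes(f) if label == a]
    return any(not is_ancestor(x, y) and not is_ancestor(y, x)
               for x, y in combinations(a_nodes, 2))

def has_one_root(f):
    """Exists x such that x is an ancestor of every node y."""
    all_nodes = [pos for pos, _ in nodes(f)]
    return any(all(is_ancestor(x, y) for y in all_nodes) for x in all_nodes)

chain = (("a", (("a", ()),)),)        # the tree aa
siblings = (("a", ()), ("a", ()))     # the forest a + a
print(two_incomparable(chain, "a"), two_incomparable(siblings, "a"))
print(has_one_root(chain), has_one_root(siblings))
```

Neither check inspects the order of siblings, which matches the remark that FO[$\prec$]-definable languages are horizontally commutative.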

3.2. Temporal logics. We describe here a general framework for temporal logics
interpreted in trees and forests. By setting appropriate parameters in the framework we generate
a range of temporal logics that are traditionally studied.

The general framework is called graded propositional dynamic logic (graded PDL). 
Syntax of temporal formulas. We distinguish between two kinds of formulas: tree formulas 
and forest formulas. The syntax of these formulas is defined by mutual recursion, as follows: 

• T and F are forest formulas. 

• If $a \in A$, then a is a tree formula. (Such formulas are called label formulas.)

• Finite boolean combinations of tree formulas are tree formulas, and finite boolean com- 
binations of forest formulas are forest formulas. 

• Every forest formula is a tree formula. 

• Before defining the key construction we need to introduce the concept of an unambiguous
set of formulas. Such a set $\Phi = \{\varphi_1, \ldots, \varphi_{n+1}\}$ is constructed from a sequence of tree
formulas $\psi_1, \ldots, \psi_n$ by a simple syntactic operation ensuring that every tree satisfies
exactly one formula from $\Phi$:

$$\varphi_i = \psi_i \wedge \bigwedge_{j \le i-1} \neg\psi_j \quad \text{for } i = 1, \ldots, n, \qquad \text{and} \qquad \varphi_{n+1} = \bigwedge_{j \le n} \neg\psi_j.$$

• If $\Phi$ is a finite unambiguous set of tree formulas, $k > 0$ is an integer, and $L \subseteq \Phi^*$ is a
regular language, then $E^k L$ is a forest formula.
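As a small worked instance of the unambiguous-set construction (the choice n = 2 is ours, for illustration), two tree formulas $\psi_1, \psi_2$ yield three formulas:

```latex
\begin{align*}
\varphi_1 &= \psi_1, \\
\varphi_2 &= \psi_2 \wedge \neg\psi_1, \\
\varphi_3 &= \neg\psi_1 \wedge \neg\psi_2.
\end{align*}
```

A tree satisfying $\psi_1$ satisfies only $\varphi_1$; a tree satisfying $\psi_2$ but not $\psi_1$ satisfies only $\varphi_2$; every other tree satisfies only $\varphi_3$.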

Semantics of temporal formulas. We define two notions of satisfaction: tree satisfaction
$t \models_t \varphi$, where t is a tree and $\varphi$ is a tree formula, which coincides with the usual notion of
satisfaction; and forest satisfaction $t \models_f \varphi$, where t is a forest and $\varphi$ is a forest formula,
which is somewhat unusual. Again, these relations are defined by mutual recursion.

• If $t \in H_A$, then $t \models_f T$ and $t \not\models_f F$.

• If $t \in T_A$ and $a \in A$, then $t \models_t a$ if and only if the root node of t is labeled a.

• Boolean operations have their usual meaning; e.g., if $t \in H_A$ and $\varphi_1, \varphi_2$ are forest
formulas, then $t \models_f \varphi_1 \wedge \varphi_2$ if and only if $t \models_f \varphi_1$ and $t \models_f \varphi_2$.

• Let $\varphi$ be a forest formula and $t \in T_A$, so that t = as for a unique $s \in H_A$, $a \in A$. Then
$t \models_t \varphi$ if and only if $s \models_f \varphi$.

• Let k > 0, and let $\Phi$ be a finite unambiguous set of tree formulas, with $L \subseteq \Phi^*$ a regular
language. Let $s \in H_A$. If x is a node of s, then we label the node by $\varphi_i$ if $t_x \models_t \varphi_i$,
where $t_x$ is the subtree of x. Note that because of the unambiguity requirement, there is
exactly one such label. A path $x_1, \ldots, x_n$ of consecutive nodes of s beginning at a root,
but not necessarily extending to a leaf, thus yields a unique word $\varphi_{i_1} \cdots \varphi_{i_n} \in \Phi^*$, which
we call the $\Phi$-path of $x_n$. We say $s \models_f E^k L$ if there are at least k nodes in the forest
whose $\Phi$-path belongs to L.

We stress that in counting paths, we do not require the paths to be disjoint, and we do not
require them to extend all the way to the leaves. For example, the forest aa + a contains
three different nonempty paths from the root, so this forest satisfies the formula $E^3 a^+$.
Given a temporal formula $\psi$, we write $L_\psi$ for the set of forests that forest satisfy $\psi$:

$$L_\psi = \{s \in H_A : s \models_f \psi\}.$$
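The counting semantics of $E^k L$ can be sketched in the simplest case. We assume, for illustration only, that $\Phi$ consists of the label formulas (one per letter, so each node is labeled by its own letter), that letters are single characters, and that L is given as a Python regular expression; the function names are hypothetical.

```python
import re

def count_paths(f, pattern, prefix=""):
    """Count the nodes of forest f whose root-to-node label word matches pattern."""
    total = 0
    for label, sub in f:
        word = prefix + label
        if re.fullmatch(pattern, word):
            total += 1
        total += count_paths(sub, pattern, word)
    return total

def forest_sat_E(f, k, pattern):
    """Forest satisfaction of E^k L: at least k nodes have their path in L."""
    return count_paths(f, pattern) >= k

def tree_sat_E(t, k, pattern):
    """Tree satisfaction: the forest formula is evaluated in the subforest of the root."""
    _label, sub = t
    return forest_sat_E(sub, k, pattern)

aa_plus_a = (("a", (("a", ()),)), ("a", ()))   # the forest aa + a
print(count_paths(aa_plus_a, "a+"))
print(forest_sat_E(aa_plus_a, 3, "a+"))
```

The paths counted here overlap (the path to the child extends the path to its parent), in line with the remark that paths need not be disjoint.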

We specialize the above framework by restricting either the value of k or the language
L in the application of the operator $E^k L$, or both. This leads, in the case of trees, to some
logics that have been widely studied. We catalogue these below:

EF. As a first example, we show how to implement the operator "there exists a descendant",
often denoted by EF. This example also highlights the difference between tree satisfaction
and forest satisfaction. Consider the special case of $E^k L$ where, for some tree formula $\psi$,

$$\Phi = \{\psi, \neg\psi\}, \qquad k = 1, \qquad L = (\neg\psi)^*\psi. \tag{3.1}$$

In this special case, we write $EF\psi$ instead of $E^k L$. It is easy to see that $EF\psi$ is forest satisfied
by a forest s if and only if $\psi$ is satisfied by some subtree of s. In the special case when
s is a tree, this subtree may be s itself. The semantics shows how to interpret $EF\psi$ as a
tree formula. If t is a tree, then t tree satisfies $EF\psi$ if and only if t has a proper subtree
that satisfies $\psi$. In other words, tree satisfaction of $EF\psi$ corresponds to the so-called strict
semantics, while forest satisfaction of $EF\psi$ corresponds to the non-strict semantics. We will
use the term EF for the fragment of graded PDL where the operator $E^k L$ is only used in
the special case of $EF\psi$.

CTL. As a second example, we show how to implement the operator $E\psi U\varphi$ of CTL. Consider
the special case of $E^k L$ where, for some tree formulas $\psi$ and $\varphi$,

$$\Phi = \{\psi \wedge \neg\varphi,\ \neg\psi \wedge \neg\varphi,\ \varphi\}, \qquad k = 1, \qquad L = (\psi \wedge \neg\varphi)^*\varphi. \tag{3.2}$$

In this special case, we write $E\psi U\varphi$ instead of $E^k L$. It is easy to see that $E\psi U\varphi$ is forest
satisfied by a forest s if and only if the subtree of some node x tree satisfies the formula
$\varphi$, and the subtree at every proper ancestor of x tree satisfies $\psi$. Let us look now at the
tree semantics of the formula $E\psi U\varphi$. If t is a tree, then t tree satisfies $E\psi U\varphi$ if and only
if the subtree of some non-root node x tree satisfies $\varphi$, and the subtree at every non-root
proper ancestor of x tree satisfies $\psi$. As was the case with the operator EF, tree satisfaction
corresponds to strict semantics and forest satisfaction corresponds to non-strict semantics. We will use
the term CTL for the fragment of graded PDL where the operator $E^k L$ is only used in the
special case of $E\psi U\varphi$.
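The difference between forest (non-strict) and tree (strict) satisfaction for EF and EU can be made concrete with a direct recursive sketch. This is an illustration under our own assumptions, not the paper's construction via (3.1) and (3.2): trees are (label, forest) pairs, tree formulas are Python predicates on trees, and all function names are hypothetical.

```python
def forest_EF(f, psi):
    """Forest satisfaction of EF psi: some subtree of f satisfies psi."""
    return any(psi(t) or forest_EF(t[1], psi) for t in f)

def tree_EF(t, psi):
    """Tree satisfaction of EF psi (strict): evaluate in the subforest of the root."""
    return forest_EF(t[1], psi)

def forest_EU(f, psi, phi):
    """Forest satisfaction of E psi U phi: some node satisfies phi, and the
    subtree at each of its proper ancestors satisfies psi."""
    return any(phi(t) or (psi(t) and forest_EU(t[1], psi, phi)) for t in f)

def tree_EU(t, psi, phi):
    """Tree satisfaction of E psi U phi (strict)."""
    return forest_EU(t[1], psi, phi)

is_a = lambda t: t[0] == "a"   # the label formula a
is_b = lambda t: t[0] == "b"   # the label formula b

leaf_b = ("b", ())
aab = ("a", (("a", (leaf_b,)),))   # the tree aab
acb = ("a", (("c", (leaf_b,)),))   # the tree acb

print(forest_EF((leaf_b,), is_b), tree_EF(leaf_b, is_b))
print(tree_EU(aab, is_a, is_b), tree_EU(acb, is_a, is_b))
```

The single-node tree b forest satisfies EF b but does not tree satisfy it, which is exactly the non-strict versus strict distinction described above.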

First-order logic. We use our temporal framework to characterize the languages definable
in FO[$\prec$].

Theorem 3.1. A forest language is definable in FO[$\prec$] if and only if it is definable by a
forest formula in which the operator $E^k L$ is restricted to word languages L that are first-order
definable over an unambiguous finite alphabet $\Phi$ of tree formulas.

Proof. The theorem is very similar to the result of Hafer and Thomas [14], who show that
first-order logic coincides with CTL* on finite binary trees. The theorem is even closer to
the result of Moller and Rabinovich [17], who show that over infinite unranked trees
Counting-CTL* is equivalent to Monadic Path Logic (MPL). To deduce our theorem from their result
it is enough to clarify the relations between the different logics.

The logics considered by Moller and Rabinovich express properties of infinite unranked 
trees, which can have both infinite branches and finite branches that end in leaves. A 
maximal path is therefore defined as a path that begins in any node, is directed away from 
the root, and either continues infinitely or ends in a leaf. Monadic Path Logic (MPL) is 
the restriction of monadic second-order logic over the predicate $\prec$ in which second-order
quantification is restricted to maximal paths. In other words, MPL is the extension of FO[$\prec$]
that allows quantification over maximal paths. Over infinite trees, MPL is more expressive
than first-order logic, since it can define the property "some path contains infinitely many
a's", which cannot be defined in FO[$\prec$]. However, over finite unranked trees, FO[$\prec$] has
the same expressive power as MPL. This is because a maximal path in a finite tree can be 
described by its first node and the leaf where it ends. 

The logic counting-CTL* can be interpreted as the fragment of graded PDL where
the operator $E^k L$ is only allowed in the following two restricted forms:

• A next operator $X^k$. The formula $X^k\varphi$ holds in a tree if the subtrees of at least k children of
the root satisfy $\varphi$. If $X^k\varphi$ is a formula of counting-CTL* and $\hat\varphi$ is a translation of $\varphi$ into
graded PDL, then $X^k\varphi$ is translated into the tree formula $E^k L_{\hat\varphi}$, where $L_{\hat\varphi} = \{\hat\varphi\}$. Indeed,
such a formula requires the existence of k different paths of length 1 whose labellings belong
to $L_{\hat\varphi}$.

• An existential path operator, which we denote here by E' (the original paper uses E, but
we use E' to highlight the slight change in semantics). This operator works like our EL,
but with the difference that E'L is a tree formula, and the path begins in the unique
root of the tree. Moller and Rabinovich require that L is definable in LTL, which is
equivalent to first-order definability.

So counting-CTL* can be translated to a fragment of graded PDL using only first-order
definable word languages in quantification. Hence, by the result of Moller and Rabinovich
we get a translation of FO[$\prec$] to this fragment. The translation in the opposite direction
is straightforward. □

Footnote: In most presentations CTL also has the "next" operator $EX\varphi$, as well as the dual operator $E\neg(\psi U\varphi)$.
The next operator is redundant thanks to the strict semantics, and the dual operator is redundant in finite
trees.

Note that Theorem 3.1 fails without the restriction on unambiguity of the alphabet
$\Phi$. For instance, if we took $A = \{a, b, c\}$ and $\Phi = \{\varphi_1, \varphi_2\}$, where $\varphi_1 = a \vee c$ and $\varphi_2 = b \vee c$,
then $L = (\varphi_1\varphi_2)^*$ is first-order definable as a word language. One can imagine what the
semantics of EL should be in the case of such a $\Phi$: a node labelled with c can be labelled
either with $\varphi_1$ or with $\varphi_2$. With this semantics, however, the language defined by EL is not
first-order definable. (If it were, we would be able to define in first-order logic the set of
forests consisting of a single path with an even number of occurrences of c.)

Actually, one can show, using composition theorems similar to those used by Hafer and 
Thomas, or Moller and Rabinovich, that graded PDL has the same expressive power as 
chain logic, which is the fragment of monadic second order logic where set quantification is 
restricted to chains, i.e. subsets of paths. 

CTL* and PDL. Finally, we define two more temporal logics by modifying the definitions
above. CTL* is like the fragment of temporal logic in Theorem 3.1, except that we only
allow k = 1 in $E^k L$. In particular, CTL* is a subset of FO[$\prec$]. We also consider PDL, which
is obtained by restricting the temporal formulas $E^k L$ to k = 1, but without the requirement
that L be first-order definable. If we place no restriction on either the multiplicity k or the
regular language L, we obtain graded PDL.

3.3. Language composition and bases. In this section we provide a more general notion 
of temporal logic, where the operators are given by regular forest languages. This is similar 
to notions introduced by Esik in [11]. The benefit of the general framework is twofold.
First, it corresponds nicely with the algebraic notion of wreath product presented later in 
the paper. Second, it allows us to state and prove negative results, for instance our infinite 
base theorem, which says that the number of operators needed to obtain first-order logic is 
necessarily large. 

We introduce a composition operation on forest languages. Fix an alphabet A, and
let $\{L_1, \ldots, L_k\}$ be a partition of $H_A$. Let $B = \{b_1, \ldots, b_k\}$ be another alphabet, with one
letter $b_i$ for each block $L_i$ of the partition. The partition and alphabet are used to define a
relabeling

$$t \in H_A \ \mapsto\ t[L_1, \ldots, L_k] \in H_{A \times B}$$

in the following manner. The nodes in the forest $t[L_1, \ldots, L_k]$ are the same as in the forest
t, but the labels are different. A node x that had label a in t gets label $(a, b_i)$ in the new
forest, where $b_i$ corresponds to the unique language $L_i$ that contains the subforest of x in
t. For the partition and B as above, and L a language of forests over $A \times B$, we define
$L[L_1, \ldots, L_k] \subseteq H_A$ to be the set of all forests t over A for which $t[L_1, \ldots, L_k] \in L$.
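The relabeling and the composed language can be sketched as follows. We assume, purely for illustration, that forests are nested tuples of (label, forest) pairs, that the blocks of the partition are given as Python predicates on forests, and that the letter $b_i$ is represented by the 0-based index of its block; the names `relabel`, `compose` and `has_label` are hypothetical.

```python
def relabel(t, blocks):
    """Compute t[L1, ..., Lk]: label a becomes (a, i), where block i is the
    unique block containing the subforest of the node in t."""
    out = []
    for label, sub in t:
        hits = [i for i, in_block in enumerate(blocks) if in_block(sub)]
        assert len(hits) == 1, "blocks must form a partition"
        out.append(((label, hits[0]), relabel(sub, blocks)))
    return tuple(out)

def compose(L, blocks):
    """The language L[L1, ..., Lk], as a predicate on forests over A."""
    return lambda t: L(relabel(t, blocks))

def has_label(f, target):
    """Does some node of the forest f carry the label target?"""
    return any(label == target or has_label(sub, target) for label, sub in f)

# Partition: forests that contain a b (block 0) and those that do not (block 1).
blocks = [lambda s: has_label(s, "b"), lambda s: not has_label(s, "b")]

# Outer language over A x B: "some node with label (a, 0)".
a_above_b = compose(lambda t: has_label(t, ("a", 0)), blocks)

acb = (("a", (("c", (("b", ()),)),)),)
ba = (("b", (("a", ()),)),)
print(relabel(acb, blocks))
print(a_above_b(acb), a_above_b(ba))
```

Here the composed language consists of the forests in which some a has a b strictly below it, which mirrors nesting one "exists a descendant" operator inside another.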

The operation of language composition is similar to formula composition. The
definitions below use this intuition, in order to define a "temporal logic" based on operators
given as forest languages. Formally, we will define the closure of a language class under
language composition. First, however, we need to comment on a technical detail concerning
alphabets. In the discussion below, a forest language is given by two pieces of information:
the forests it contains, and the input alphabet. For instance, we distinguish between the set
$L_1$ of all forests over the alphabet {a}, and the set $L_2$ of all forests over the alphabet {a, b} where b
does not appear. The idea is that sometimes it is relevant to consider a language class $\mathcal{L}$
that contains $L_1$ but does not contain $L_2$, such as the class of definite languages that only
look at a bounded prefix of the input forest (such classes will not appear in this particular
paper). This distinction will be captured by our notion of language class: a language class
is actually a mapping which associates to each finite alphabet a class of languages over
this alphabet.

Logic          Languages in the language base for alphabet A
EF             {"some node with a" : $a \in A$}
CTL            {"some path in $B^*b$" : $B \subseteq A$, $b \in A$}
FO[$\prec$]    {"at least k paths in L" : $k \in \mathbb{N}$, $L \in FO_A[<]$}
CTL*           {"some path in L" : $L \in FO_A[<]$}
PDL            {"some path in $L \subseteq A^+$" : L regular}
graded PDL     {"at least k paths in $L \subseteq A^+$" : $k \in \mathbb{N}$, L regular}

Figure 1: Language bases for temporal logics

Let $\mathcal{L}$ be a class of forest languages, which will be called the language base. The
temporal logic with language base $\mathcal{L}$ is defined to be the smallest class TL[$\mathcal{L}$] of forest languages
that contains $\mathcal{L}$ and is closed under boolean combinations and language composition, i.e.

$$L_1, \ldots, L_k, L \in TL[\mathcal{L}] \ \Rightarrow\ L[L_1, \ldots, L_k] \in TL[\mathcal{L}].$$

Formally speaking, in the above we should highlight the alphabets (the languages $L_1, \ldots, L_k$
and $L[L_1, \ldots, L_k]$ belong to the part of TL[$\mathcal{L}$] for alphabet A, while the language L belongs
to the part of TL[$\mathcal{L}$] for alphabet $A \times B$, as in the definition of the composition operation).

We can translate the definitions of the temporal logics we have considered in terms of 
language composition. This gives the following theorem. 

Theorem 3.2. The logics EF, CTL, FO[$\prec$], CTL*, PDL and graded PDL have language
bases as depicted in Figure 1.

Note that the assertion about FO[$\prec$] depends on Theorem 3.1.

4. Forest Algebras 

4.1. Definition of forest algebras. Forest algebras, introduced in [3] by Bojańczyk and
Walukiewicz, extend the algebraic theory of syntactic monoid and syntactic morphism for
regular languages of words to the setting of unranked trees and forests. A forest algebra is
a pair (H, V) of monoids together with a faithful monoidal left action of V on the set H.
This means that for all $h \in H$, $v \in V$, there exists $vh \in H$ such that (i) (vw)h = v(wh) for
all $v, w \in V$ and $h \in H$, (ii) if $1 \in V$ is the identity element, then 1h = h for all $h \in H$,
and (iii) if vh = v'h for all $h \in H$, then v = v'. We write the operation in H additively, and
denote the identity of H by 0. We call H and V, respectively, the horizontal and vertical
components of the forest algebra. The idea is that H represents forests and V represents
contexts. As was the case with the addition in $H_A$, this is not meant to suggest that H is
a commutative monoid, although in all the applications in the present paper H will indeed
be commutative. We require one additional condition: for each $h \in H$ there are elements
$1 + h, h + 1 \in V$ such that for all $g \in H$, (1 + h)g = g + h and (h + 1)g = h + g. A
consequence is that every element $h \in H$ can be written as h = v0 for some v, namely
v = h + 1. A homomorphism of forest algebras consists of a pair of monoid homomorphisms
$(\alpha_H, \alpha_V) : (H, V) \to (H', V')$ such that $\alpha_H(vh) = \alpha_V(v)\alpha_H(h)$ for all $v \in V$ and $h \in H$.
We usually drop the subscripts on the component morphisms and simply write $\alpha$ for both
these maps.
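The axioms can be checked mechanically on a small example. The sketch below uses a two-element algebra of our own choosing (H = {0, 1} with + taken as max, and V consisting of the identity and the constant map to 1, stored as tuples (v(0), v(1))); this particular algebra and all names are assumptions made for illustration.

```python
from itertools import product

H = (0, 1)               # horizontal monoid; the identity is 0
V = ((0, 1), (1, 1))     # vertical monoid: the identity map and the constant map to 1
ONE = (0, 1)             # identity element of V

def add(g, h):
    """Horizontal operation g + h."""
    return max(g, h)

def act(v, h):
    """Action of a vertical element on a horizontal element."""
    return v[h]

def mul(v, w):
    """Vertical operation, defined so that (vw)h = v(wh)."""
    return tuple(v[w[h]] for h in H)

def is_forest_algebra():
    closed = all(mul(v, w) in V for v, w in product(V, V))
    action = all(act(mul(v, w), h) == act(v, act(w, h))
                 for v, w, h in product(V, V, H))
    unit = all(act(ONE, h) == h for h in H)
    faithful = len(set(V)) == len(V)   # distinct tuples act differently on H
    # Insertion condition: the maps g -> g + h and g -> h + g must lie in V.
    insertion = all(tuple(add(g, h) for g in H) in V and
                    tuple(add(h, g) for g in H) in V for h in H)
    return closed and action and unit and faithful and insertion

print(is_forest_algebra())
```

Since vertical elements are stored as their graphs on H, faithfulness reduces to the tuples being pairwise distinct.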

Of course, if A is a finite alphabet, then $(H_A, V_A)$ is a forest algebra. The empty forest
is the identity of $H_A$, and the empty context □ is the identity of $V_A$. This is the free forest
algebra on A, and we denote it $A^\Delta$. It has the property that if (H, V) is any forest algebra
and $f : A \to V$ is a map, then there is a unique homomorphism $\alpha$ from $A^\Delta$ to (H, V) such
that $\alpha(a\Box) = f(a)$ for all $a \in A$.

4.2. Recognition and syntactic forest algebra. Given a homomorphism $\alpha : A^\Delta \to (H, V)$
and a subset X of H, we say that $\alpha$ recognizes the language $L = \alpha^{-1}(X)$, and also
that (H, V) recognizes L. A forest language is regular if and only if it is recognized in this
fashion by a finite forest algebra. Moreover, for every forest language $L \subseteq H_A$, there is
a special homomorphism $\alpha_L : (H_A, V_A) \to (H_L, V_L)$ recognizing L that is minimal in the
sense that $\alpha_L$ is surjective, and factors through every homomorphism that recognizes L.
We call $\alpha_L$ the syntactic morphism of L, and $(H_L, V_L)$ the syntactic forest algebra of L.
If $s, s' \in H_A$, then $\alpha_L(s) = \alpha_L(s')$ if and only if for all $v \in V_A$, $vs \in L \Leftrightarrow vs' \in L$. This
equivalence is called the syntactic congruence of L. An important fact in applications of this
theory is that one can effectively compute the syntactic morphism and algebra of a regular
forest language L from any automaton that recognizes L. (See [3].)
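Recognition by a homomorphism from the free forest algebra can be sketched by evaluating a forest bottom-up. The two-element algebra, the letter assignment and the accepting set below are our own illustrative choices (they recognize "some node is labeled a" over the alphabet {a, b}); forests are assumed to be nested tuples of (label, forest) pairs.

```python
# Assumed algebra: H = {0, 1} with + = max; V holds the identity and the
# constant map to 1, stored as tuples (v(0), v(1)).
ZERO = 0
IDENTITY, CONST_ONE = (0, 1), (1, 1)

def add(g, h):
    return max(g, h)

def act(v, h):
    return v[h]

def alpha(f, letter_map):
    """Image in H of the forest f under the homomorphism with alpha(a□) = letter_map[a]."""
    value = ZERO
    for label, sub in f:
        # alpha(a s) = alpha(a□) alpha(s), and alpha(s + t) = alpha(s) + alpha(t)
        value = add(value, act(letter_map[label], alpha(sub, letter_map)))
    return value

LETTERS = {"a": CONST_ONE, "b": IDENTITY}   # the map f : A -> V
ACCEPT = {1}                                # the subset X of H

def recognized(f):
    """Membership in the recognized language, i.e. the preimage of X."""
    return alpha(f, LETTERS) in ACCEPT

print(recognized((("b", (("a", ()),)),)))    # the tree ba
print(recognized((("b", ()), ("b", ()))))    # the forest b + b
```

Only the images of the contexts a□ are specified; the value on every forest is then forced, which is the universal property of the free forest algebra stated above.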

We say that a forest algebra $(H_1, V_1)$ divides $(H_2, V_2)$, in symbols $(H_1, V_1) \prec (H_2, V_2)$,
if $(H_1, V_1)$ is a quotient of a subalgebra of $(H_2, V_2)$. In particular, $(H_L, V_L)$ divides every
forest algebra that recognizes L.

There is a subtle point in the definition of division of forest algebras given above that 
we will need to address. We have defined this in a way that directly generalizes the standard 
notion of division of monoids: A divisor of a monoid M is a quotient of a submonoid of M. 
But a forest algebra is, in particular, a transformation monoid, and there is a second notion
of division, which comes from the theory of transformation monoids, that will be particularly
useful when we deal with wreath products: We say that (H, V) tm-divides (H', V') if there
is a submonoid K of H', and a surjective monoid homomorphism $\Psi : K \to H$ such that for
each $v \in V$ there exists $\hat{v} \in V'$ with $\hat{v}K \subseteq K$, and for all $k \in K$,

$$\Psi(\hat{v}k) = v\Psi(k).$$

Fortunately, the two notions of division coincide, as shown in the following Lemma. 

Lemma 4.1. Let $(H_1, V_1)$ and $(H_2, V_2)$ be forest algebras. Then $(H_1, V_1) \prec (H_2, V_2)$ if and only
if $(H_1, V_1)$ tm-divides $(H_2, V_2)$.

Proof. First suppose $(H_1, V_1)$ divides $(H_2, V_2)$. Then there is a submonoid V' of $V_2$ and a
forest algebra homomorphism

$$\alpha : (V' \cdot 0, V') \to (H_1, V_1).$$

(Strictly speaking, we should reduce V' to the quotient that acts faithfully on $V' \cdot 0$, but
leaving this reduction out does not change the argument.) Let $v \in V_1$, and set $\hat{v}$ to be any
element of V' such that $\alpha(\hat{v}) = v$. We then have for $h \in V' \cdot 0$,

$$\alpha(\hat{v}h) = \alpha(\hat{v})\alpha(h) = v\alpha(h),$$

so $(H_1, V_1)$ tm-divides $(H_2, V_2)$.

Conversely, suppose $(H_1, V_1)$ tm-divides $(H_2, V_2)$, with underlying homomorphism
$\alpha : H' \to H_1$. Let A be an alphabet at least as large as $V_1$, and let $\gamma : A \to V_1$ be an onto map.
This extends, because of the universal property of the free forest algebra, to a (surjective)
forest algebra homomorphism $\gamma : A^\Delta \to (H_1, V_1)$. We define $\delta : A \to V_2$ by setting

$$\delta(a) = \widehat{\gamma(a)}$$

for all $a \in A$, and consider its extension $\delta$ to a forest algebra homomorphism. It is enough
to show that for $x, y \in V_A$, $\delta(x) = \delta(y)$ implies $\gamma(x) = \gamma(y)$. This will imply that $\gamma$ factors
through $\delta$ and give the required division.

Observe that if $s \in H_A$, then $\delta(s)$ is in the domain H' of $\alpha$, because $s = x \cdot 0$ for some
$x \in V_A$, and thus

$$\begin{aligned}
\gamma(s) &= \gamma(x)\gamma(0) \\
&= \gamma(x)\alpha(\delta(0)) \\
&= \alpha(\widehat{\gamma(x)}\,\delta(0)) \\
&= \alpha(\delta(x)\delta(0)) \\
&= \alpha(\delta(x \cdot 0)) \\
&= \alpha(\delta(s)).
\end{aligned}$$

So by assumption, we have

$$\gamma(a)\alpha(\delta(s)) = \alpha(\delta(a)\delta(s))$$

for all $s \in H_A$, $a \in A$. A straightforward induction on the number of nodes in x implies
that for any $x \in V_A$,

$$\gamma(x)\alpha(\delta(s)) = \alpha(\delta(x)\delta(s)).$$

Now suppose $h \in H_1$ and $\delta(x) = \delta(y)$. As noted above, $h = \alpha(\delta(s))$ for some $s \in H_A$, and
consequently

$$\begin{aligned}
\gamma(x) \cdot h &= \gamma(x)\alpha(\delta(s)) \\
&= \alpha(\delta(x)\delta(s)) \\
&= \alpha(\delta(y)\delta(s)) \\
&= \gamma(y)\alpha(\delta(s)) \\
&= \gamma(y) \cdot h.
\end{aligned}$$

Since h was arbitrary, we get $\gamma(x) = \gamma(y)$, by faithfulness.

□ 



4.3. Wreath product. Here we introduce the wreath product of forest algebras. We first
try to give some intuition behind the construction. The wreath product originally arose in
the theory of permutation groups, but it was subsequently adapted to provide an algebraic
model of serial composition of automata. The idea is that the first automaton reads an
input word $a_1 \cdots a_n$ beginning in state $q_0$. The second automaton sees both the run of the
first automaton on this input string and the original input string; that is, it reads
the sequence

$$(q_0, a_1),\ (q_0 a_1, a_2),\ \ldots,\ (q_0 a_1 \cdots a_{n-1}, a_n)$$

as an input word, beginning in its initial state $p_0$. This defines a composite action of words
over the original input alphabet $A$ on pairs of states $(p, q)$. The wreath product is, essentially,
the transition monoid of this action.
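The serial composition just described can be sketched in Python for automata given as transition tables. This is only an illustration: the function names `run_states` and `cascade` and the two example automata are assumptions made here, not part of the paper.

```python
def run_states(delta, q0, word):
    """Return the states q0, q0*a1, ..., q0*a1...a_{n-1} seen before each letter,
    together with the final state."""
    states, q = [], q0
    for a in word:
        states.append(q)
        q = delta[(q, a)]
    return states, q


def cascade(delta1, q0, delta2, p0, word):
    """Run automaton 1 on `word`, then automaton 2 on the pairs (state, letter)."""
    before, q_final = run_states(delta1, q0, word)
    p = p0
    for q, a in zip(before, word):
        p = delta2[(p, (q, a))]
    return p, q_final


# Hypothetical first automaton: parity of the number of a's read so far.
delta1 = {(q, x): (q ^ 1 if x == "a" else q) for q in (0, 1) for x in "ab"}
# Hypothetical second automaton: remembers whether some b was read at odd parity.
delta2 = {(p, (q, x)): (p or (x == "b" and q == 1))
          for p in (False, True) for q in (0, 1) for x in "ab"}
```

The second automaton never looks at the input word alone; it only sees pairs (state of the first automaton, letter), which is exactly the input sequence displayed above.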

The idea behind the wreath product of two forest algebras is also to model sequential
composition. The first algebra 'runs' on an input forest, and then a second automaton runs
on the same forest, but also gets to see the run of the first automaton. We will make
this composition precise by defining the sequential composition of two homomorphisms.
Assume that

$$\alpha : A^\Delta \to (G, W)$$

is a forest algebra homomorphism. For a forest $t$ over $A$, let $t^\alpha$ be the forest over $A \times G$
obtained from $t$ by changing the label of each node $x$ from $a$ to the pair $(a, g)$, where $g \in G$
is the value assigned by $\alpha$ to the subforest of $x$. In other words, $t^\alpha$ is the forest $t[L_1, \ldots, L_n]$,
where $G = \{g_1, \ldots, g_n\}$ and $L_i = \alpha^{-1}(g_i)$. The sequential composition will use a second
homomorphism that reads the relabeling $t^\alpha$ and yields a value in a second forest algebra;
that is,

$$\beta : (A \times G)^\Delta \to (H, V).$$

The sequential composition of $\alpha$ and $\beta$ is the function $\alpha \otimes \beta : H_A \to G \times H$ defined by

$$t \mapsto (\alpha(t), \beta(t^\alpha)).$$

The wreath product $(G, W) \circ (H, V)$ of forest algebras is defined to capture this notion of
sequential composition. While it is hardly surprising that there is an algebraic construction
that models sequential composition for forests, just as there is such a construction for words,
it is rather remarkable that the construction for forest algebras is identical to the one used
for transformation monoids. (In fact, one could even argue that the wreath product is
better suited to forest languages, since it works directly on the forest algebra, while for
word languages one goes from monoids to transformation monoids.)
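The relabeling $t^\alpha$ and the sequential composition can be sketched in Python as follows. The encoding is an assumption made for illustration: a forest is a tuple of trees, a tree is a pair (label, forest of children), and a homomorphism is given by a triple (neutral element, addition, action of the context a□). The two example homomorphisms are hypothetical.

```python
def evaluate(forest, zero, add, act):
    """Value of a forest under the homomorphism whose image of the context a[]
    is the map g -> act(a, g)."""
    value = zero
    for label, children in forest:
        value = add(value, act(label, evaluate(children, zero, add, act)))
    return value


def relabel(forest, zero, add, act):
    """t^alpha: replace the label a of each node by (a, g), where g is the value
    of the forest below that node."""
    return tuple(
        ((label, evaluate(children, zero, add, act)),
         relabel(children, zero, add, act))
        for label, children in forest)


def sequential(forest, alg1, alg2):
    """(alpha (x) beta)(t) = (alpha(t), beta(t^alpha))."""
    return evaluate(forest, *alg1), evaluate(relabel(forest, *alg1), *alg2)


# Hypothetical alpha: "the forest contains a node labeled a".
alg1 = (False, lambda g, h: g or h, lambda a, g: g or a == "a")
# Hypothetical beta over A x G: "some node labeled b has an a strictly below it".
alg2 = (False, lambda g, h: g or h,
        lambda ag, h: h or (ag[0] == "b" and ag[1]))

t = (("b", (("a", ()),)), ("c", ()))   # the forest b(a) + c
```

Here `sequential(t, alg1, alg2)` pairs the value of the first homomorphism on `t` with the value of the second on the relabeled forest, mirroring the definition of $\alpha \otimes \beta$.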

We now present the definition of the wreath product of two forest algebras $(H_1, V_1)$ and
$(H_2, V_2)$. This wreath product is denoted by $(H_1, V_1) \circ (H_2, V_2)$.

Note that forest algebras are transformation monoids, for which the wreath product is 
a classical operation. We will apply the classical definition without changes in this setting, 
yielding some of the ingredients of a forest algebra, namely: 1) the carriers of the horizontal 
and vertical monoid; 2) the action of the vertical monoid on the horizontal monoid; and 3) 
the composition operation in the vertical monoid. The missing ingredient, not given by the 
classical definition, will be 4) the monoid operation in the horizontal monoid. 

We describe below the classical definition of the wreath product of transformation monoids,
as applied to the special case of forest algebras. The states that are transformed, which in
the case of forest algebras correspond to the horizontal monoid, are the cartesian product
$H_1 \times H_2$ with component-wise addition. The transforming monoid, which in the case of
forest algebras corresponds to the vertical monoid, is more sophisticated: its carrier set is
$V_1 \times V_2^{H_1}$. The action of the transforming monoid $V_1 \times V_2^{H_1}$ on the transformed states
$H_1 \times H_2$ is defined by

$$(v_1, f)(h_1, h_2) = (v_1 h_1, f(h_1) h_2).$$

The composition operation in the transforming monoid $V_1 \times V_2^{H_1}$ is defined by

$$(v, f) \cdot (v', f') = (v v', f''), \qquad f''(h) = f(v'h) \cdot f'(h).$$






As is well known, this definition turns $V_1 \times V_2^{H_1}$ into a monoid of faithful transformations on
$H_1 \times H_2$. (Observe that since we define forest algebras using a left action of $V$ on $H$, rather
than a right action, our definition of the wreath product is the reverse of the customary
one, with the first algebra in the composition written as the left-hand factor in the wreath
product, rather than as the right-hand factor.)
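As a sanity check, the action and the composition rule of the wreath product can be written out in Python and compared against ordinary composition of transformations. The encoding is an assumption for illustration: horizontal elements are integers, a vertical element is a tuple listing its values, and an element of the wreath product is a pair (v, f) where f is a tuple indexed by the first horizontal monoid.

```python
from itertools import product


def act(elem, state):
    """(v, f)(h1, h2) = (v h1, f(h1) h2)."""
    (v, f), (h1, h2) = elem, state
    return (v[h1], f[h1][h2])


def compose(e, e2):
    """(v, f) . (v', f') = (v v', f'') with f''(h) = f(v' h) . f'(h)."""
    (v, f), (v2, f2) = e, e2
    vv = tuple(v[v2[h]] for h in range(len(v2)))
    ff = tuple(tuple(f[v2[h]][f2[h][x]] for x in range(len(f2[h])))
               for h in range(len(v2)))
    return (vv, ff)


# Example: both factors are the two-element algebra with horizontal monoid
# {0, 1} (1 plays the role of the absorbing element) and vertical monoid
# {ID, ZERO}, where ZERO sends everything to 1.
ID, ZERO = (0, 1), (1, 1)
elements = [(v, f) for v in (ID, ZERO) for f in product((ID, ZERO), repeat=2)]
states = [(h1, h2) for h1 in (0, 1) for h2 in (0, 1)]
```

On this example, applying `compose(e, e2)` to a state gives the same result as applying `e2` and then `e`, which is the statement that the composition formula really describes composition of the induced transformations.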

By applying the definition of wreath product for transformation monoids, we have 
obtained most of the ingredients of forest algebra. We are missing the monoid operation on 
the horizontal monoid; for this we use the usual direct product. 

The last missing condition is that for every element $h$ of the horizontal monoid, a forest
algebra should have elements $1 + h$ and $h + 1$ of the vertical monoid that satisfy

$$(1 + h)g = g + h \quad\text{and}\quad (h + 1)g = h + g.$$

We show that these elements exist in the wreath product. Let $h = (h_1, h_2) \in H_1 \times H_2$.
Consider the map $f : H_1 \to V_2$ that sends every element to $1 + h_2$. Then for any
$g = (g_1, g_2) \in H_1 \times H_2$, we have

$$\begin{aligned}
(1 + h_1, f)(g_1, g_2) &= ((1 + h_1)g_1, (1 + h_2)g_2) \\
&= (g_1 + h_1, g_2 + h_2) \\
&= (g_1, g_2) + (h_1, h_2).
\end{aligned}$$

Therefore, the element $(1 + h_1, f)$ plays the role of $1 + (h_1, h_2)$. Similarly, we find that
$V_1 \times V_2^{H_1}$ contains the transformation $(h_1, h_2) + 1$.

Thus the wreath product of two forest algebras is a forest algebra. 

Well-known properties of the wreath product of transformation semigroups and monoids
carry over unchanged to this setting. In particular, the wreath product is associative, so
we can talk about the wreath product of any sequence of forest algebras, and about the
iterated wreath product of an arbitrary number of copies of a single forest algebra. Likewise,
the direct product of two forest algebras embeds in their wreath product in either direction.
As a consequence, if $L_1$, $L_2$ are recognized by forest algebras $(H_1, V_1)$, $(H_2, V_2)$ respectively,
then their union and intersection are both recognized by $(H_1, V_1) \circ (H_2, V_2)$.

The connection with sequential composition is given by: 

Theorem 4.2. For every pair of forest algebra homomorphisms

$$\alpha : A^\Delta \to (G, W), \qquad \beta : (A \times G)^\Delta \to (H, V),$$

there is a homomorphism into the wreath product $(G, W) \circ (H, V)$ that, when restricted to
forests, is equal to the sequential composition

$$\alpha \otimes \beta : H_A \to G \times H.$$

Conversely, every homomorphism from a free forest algebra into the wreath product of
two forest algebras is realized in this manner by the sequential composition of two
homomorphisms.

Proof. Given homomorphisms $\alpha$, $\beta$ as above, consider the map from $A$ into the vertical
monoid of $(G, W) \circ (H, V)$ given by

$$a \mapsto (\alpha(a\Box), f_a),$$

where for all $a \in A$, $g \in G$,

$$f_a(g) = \beta((a, g)\Box).$$

By the universal property of $A^\Delta$, this map extends to a unique homomorphism $\gamma$ with
domain $A^\Delta$. A straightforward induction on the construction of a forest $t \in H_A$ shows that
$\gamma(t) = (\alpha(t), \beta(t^\alpha))$. The crucial step is when $t = as$ for some $a \in A$, $s \in H_A$. We then have
$t^\alpha = (a, \alpha(s)) \cdot s^\alpha$, so that

$$\begin{aligned}
\gamma(t) &= \gamma(a\Box) \cdot \gamma(s) \\
&= (\alpha(a\Box), f_a) \cdot (\alpha(s), \beta(s^\alpha)) \\
&= (\alpha(a\Box) \cdot \alpha(s),\ f_a(\alpha(s)) \cdot \beta(s^\alpha)) \\
&= (\alpha(as),\ \beta((a, \alpha(s))\Box) \cdot \beta(s^\alpha)) \\
&= (\alpha(t), \beta(t^\alpha)).
\end{aligned}$$

Conversely, if $\gamma : A^\Delta \to (G, W) \circ (H, V)$ is a homomorphism, then for each $a \in A$,
$\gamma(a\Box)$ has the form $(w_a, f_a)$ for some $w_a \in W$, $f_a : G \to V$. We define homomorphisms

$$\alpha : A^\Delta \to (G, W), \qquad \beta : (A \times G)^\Delta \to (H, V)$$

by setting, for each $a \in A$, $g \in G$,

$$\alpha(a\Box) = w_a, \qquad \beta((a, g)\Box) = f_a(g).$$

As we saw above, $\alpha \otimes \beta$ is the unique homomorphism mapping $a\Box$ to $(w_a, f_a)$, so $\gamma = \alpha \otimes \beta$.

□ 

5. Wreath Product Characterizations of Language Classes

When $\mathcal{A}$ is a class of forest algebras, we write TL[$\mathcal{A}$] for the class of languages recognized by
iterated wreath products of forest algebras from $\mathcal{A}$. The following corollary to Theorem 4.2
justifies this notation.

Corollary 5.1. Let $\mathcal{L}$ be the class of languages recognized by a class of forest algebras $\mathcal{A}$.
Then TL[$\mathcal{L}$] = TL[$\mathcal{A}$].

We also say that $\mathcal{A}$ is an algebraic base of the language class TL[$\mathcal{A}$] (note that there
may be several algebraic bases, just as there may be several language bases). We will now
exhibit algebraic bases for the logics introduced earlier in the paper. By the above corollary, all we
need to do is to provide, for each logic, a class of forest algebras that captures the language
base. We could, of course, simply say that an algebraic base consists of the syntactic
forest algebras of the members of the language base, but we prefer more explicit algebraic
descriptions. These are given in the following theorem; the algebras used in the statement
are described immediately afterwards, while the detailed proofs are not given until Section 7.

Theorem 5.2. The logics EF, CTL, FO[$\prec$], CTL*, PDL and graded PDL have algebraic
bases as depicted in Figure 2.

We now proceed to describe the algebras mentioned in Figure 2. The bases have been
chosen so that each base is either finite or, if it is an infinite class of algebras,
has an effective characterization, i.e. there is an algorithm that checks if the syntactic
algebra of a given forest language belongs to the base. Furthermore, the infinite algebraic
bases are given by identities in the forest algebra, and therefore the algorithm reduces to
checking if the identities hold.

Figure 2: Algebraic bases for temporal logics

First, we recall that an aperiodic finite monoid $S$ is one that contains no nontrivial
groups. Equivalently, there exists $m > 0$ such that $s^m = s^{m+1}$ for all $s \in S$. When we say
that a forest algebra $(H, V)$ is aperiodic, we mean that the vertical monoid $V$ is aperiodic
(which implies that $H$ is aperiodic).

$U_1$ is the forest algebra $(\{0, \infty\}, \{1, 0\})$, with $0 \cdot \infty = 0 \cdot 0 = \infty$. Note that since we use
additive notation in the horizontal monoid, the additive absorbing element is denoted $\infty$,
while the multiplicative absorbing element is $0$. The vertical monoid of $U_1$ is the unique
smallest nontrivial aperiodic monoid, denoted $U_1$ in the literature. Another description of
$U_1$ is that it is the syntactic forest algebra of the forest language "some node with a" over an
alphabet $A \ni a$ with at least two letters. It follows that every language in the language base
of EF is recognized by $U_1$, and every language recognized by $U_1$ is a boolean combination
of members of the language base of EF, so this algebra forms an algebraic base for EF.
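To make the description of $U_1$ concrete, here is a small Python sketch that evaluates a forest in $U_1$ under the homomorphism for "some node with a". The encoding is an assumption for illustration: a forest is a tuple of trees, a tree is a pair (label, children), the horizontal elements are `0` and `INF`, and the vertical elements are the integers `1` (identity) and `0` (absorbing).

```python
INF = "inf"   # the absorbing element of the horizontal monoid {0, INF}


def add(g, h):
    """Horizontal addition: 0 is neutral, INF is absorbing."""
    return 0 if g == 0 and h == 0 else INF


def apply(v, h):
    """Action of the vertical monoid {1, 0}: 1 is the identity, 0 maps all to INF."""
    return h if v == 1 else INF


def value(forest, letter="a"):
    """Image of a forest; it is INF exactly when some node carries `letter`."""
    total = 0
    for label, children in forest:
        v = 0 if label == letter else 1      # image of the context label[]
        total = add(total, apply(v, value(children, letter)))
    return total
```

The language "some node with a" is then the preimage of `INF`.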

$U_2$ is the forest algebra $(\{0, \infty\}, \{1, c_0, c_\infty\})$ with $c_h \cdot h' = h$ for all horizontal elements
$h, h'$. If one reverses the action from left to right and ignores the additive structure, $U_2$ is
the aperiodic unit in the Krohn-Rhodes Theorem. The underlying monoid of this
transformation semigroup is usually denoted $U_2$. Every language recognized by $U_2$ is a boolean
combination of members of the language base of CTL, and all languages recognized by $U_2$
are in CTL, so $U_2$ forms an algebraic base for CTL.

So much for the singleton bases. We now describe the infinite bases. 

A distributive algebra is a forest algebra $(H, V)$ such that $H$ is commutative and such
that the action of $V$ on $H$ is distributive: $v(h_1 + h_2) = vh_1 + vh_2$ for all $v \in V$, $h_1, h_2 \in H$.
The assertion that distributive algebras form algebraic bases for the given language classes
is a consequence of the following theorem:

Theorem 5.3. A forest language is a boolean combination of languages EL (respectively, 
languages EL with L first-order definable) if and only if it is recognized by a distributive 
forest algebra (respectively, an aperiodic distributive forest algebra). 

Let us define a path language to be any boolean combination of members of the language
base of graded PDL, and an fo path language to be a boolean combination of members of
the language base of FO[$\prec$]. We have the following analogue to Theorem 5.3.

Theorem 5.4. A finite forest algebra $(H, V)$ recognizes only path languages if and only if
$H$ is aperiodic and commutative and

$$vg + vh = v(g + h) + v0 \qquad (5.1)$$

$$u(g + h) = u(g + uh) \qquad (5.2)$$

hold for all $g, h \in H$ and $u, v \in V$ with $u^2 = u$. $(H, V)$ recognizes only fo-path languages
if and only if $H$ is aperiodic and commutative, $V$ is aperiodic, and $(H, V)$ satisfies the two
identities above.

We define a path algebra to be a forest algebra $(H, V)$ satisfying identities (5.1) and (5.2)
with $H$ aperiodic and commutative. We will give the proofs of Theorems 5.3 and 5.4 in
Section 7.

Because of the connection with logic, we will call divisors of the six kinds of iterated 
wreath products described above EF-algebras, CTL-algebras, CTL*-algebras, FO-algebras, 
PDL-algebras, and graded PDL-algebras, respectively. 

Note that for EF and CTL, the algebraic base has one algebra, while our other bases 
contain infinitely many algebras. This turns out to be optimal, as stated below. 

Theorem 5.5 (Infinite base theorem). None of the language classes CTL*, FO[$\prec$], PDL,
or graded PDL has a finite algebraic base.

Proof. If a language class has an algebraic base consisting of a finite set of forest algebras

$$(H_1, V_1), \ldots, (H_k, V_k),$$

then it has a base containing just the single algebra

$$(H, V) = (H_1, V_1) \times \cdots \times (H_k, V_k).$$

This is because each of the $(H_i, V_i)$ divides $(H, V)$, and $(H, V)$ embeds into the wreath
product of the $(H_i, V_i)$, in any order. Consequently, iterated wreath products of the $(H_i, V_i)$
and iterated wreath products of $(H, V)$ have the same divisors, and so recognize the same
languages.

By these observations, it suffices to show that none of the classes in the statement of
the theorem has an algebraic base consisting of a single forest algebra $(H, V)$. We will give
two different arguments for this, one applicable to the aperiodic classes CTL* and FO[$\prec$],
and the other for the nonaperiodic classes.

Suppose the language class FO[$\prec$] is generated by a single algebra $(H, V)$. Since $(H, V)$
is required to recognize only languages in this class, $V$ is aperiodic, and thus there is an
integer $n$ such that $v^n = v^{n+1}$ for all $v \in V$. We will show that no iterated wreath product of
copies of $(H, V)$ can recognize the language $L_n$ consisting of all forests over $A = \{a, b, c\}$ in
which there is a path from the root with the label in $(a^n b)^* c$. Since $L_n$ is in CTL* $\subseteq$ FO[$\prec$],
this will give the desired conclusion also for CTL*.

We prove this by induction on the number of factors $k$ in the wreath product, showing
that there are forests $s_k \in L_n$ and $t_k \notin L_n$ such that $\phi(s_k) = \phi(t_k)$ is satisfied for every
homomorphism $\phi$ from $A^\Delta$ into the $k$-fold wreath product of $(H, V)$. For $k = 1$, we can
simply take $s_1 = a^n bc$ and $t_1 = a^{n+1} bc$. For the inductive step we suppose the claim holds
for some $k \ge 1$, and let $(G, W)$ denote the $k$-fold wreath product of the $(H, V)$. Consider a
homomorphism $\phi$ from $A^\Delta$ into the $(k+1)$-fold wreath product $(G, W) \circ (H, V)$. Recalling
the definition of the wreath product we have $\phi : A^\Delta \to (G \times H, W \times V^G)$. If we compose
$\phi$ with the projection onto the left coordinate we obtain a homomorphism $\psi$ into $(G, W)$.
Note that since aperiodicity is preserved under wreath products, there is an $m$ such that
$w^m = w^{m+1}$ for all $w \in W$.

We first claim that if $p$ and $q$ are contexts in $V_A$ such that $\psi(p) = \psi(q)$, then
$\phi(p^n q^{m+1}) = \phi(p^{n+1} q^{m+1})$. To see this, first take $(g_0, h_0)$ in $G \times H$. We have
$\psi(p)\,\psi(q^{m+1}) = \psi(q^{m+2}) = \psi(q^{m+1})$, so we have

$$\phi(q^{m+1})(g_0, h_0) = (g_1, h_1),$$

where $\psi(p) g_1 = g_1$. Let us write $\phi(p)$ as $(\psi(p), f)$, where $f : G \to V$. We then have

$$\phi(p^{n+1} q^{m+1})(g_0, h_0) = (g_1, f(g_1)^{n+1} h_1) = (g_1, f(g_1)^{n} h_1) = \phi(p^{n} q^{m+1})(g_0, h_0).$$

Since $g_0$, $h_0$ are arbitrary, this proves $\phi(p^n q^{m+1}) = \phi(p^{n+1} q^{m+1})$, as claimed. We now make
particular choices for $p$ and $q$, namely

$$p = a\Box + b t_k, \qquad q = a\Box + b s_k.$$

Since $\psi(s_k) = \psi(t_k)$, we have $\psi(p) = \psi(q)$, and thus by our claim above,
$\phi(p^n q^{m+1}) = \phi(p^{n+1} q^{m+1})$. Set $s_{k+1} = p^n q^{m+1} \cdot 0$ and $t_{k+1} = p^{n+1} q^{m+1} \cdot 0$. So
$\phi(s_{k+1}) = \phi(t_{k+1})$. For every path $w$ from the root in $s_k$, there is a path in $s_{k+1}$ with
label $a^n b w$. On the other hand, for every path with a label $a^n b v$ from the root of $t_{k+1}$,
$v$ is the label of a path from the root of $t_k$. Thus $s_{k+1} \in L_n$ and $t_{k+1} \notin L_n$, as claimed.

We now turn to the nonaperiodic case. Let $p$ be a prime that does not divide the
order of any group in $(H, V)$, and let $L$ be the set of forests over $\{a, b\}$ in which there is a
path from the root of the form $a^m b$, where $p$ divides $m$. We will show that no iterated
wreath product of copies of $(H, V)$ can recognize $L$. Since $L$ has the form EK for a regular
word language $K$, $L$ is in PDL, so this will complete the proof.

It is easy to see that the vertical monoid of the syntactic forest algebra of $L$ contains a
group of order $p$: let $0 \le r < p$, and let $H_r$ be the set of forests in which every path from
the root has an initial segment of the form $a^j b$, where $r \equiv j \pmod{p}$. Each $H_r$ is a class of
the syntactic congruence, all $p$ of these classes are distinct, and the context $a\Box$ cyclically
permutes them. On the other hand, the set of simple groups dividing a transformation
monoid is preserved under wreath product, so no iterated wreath product of copies of
$(H, V)$ can contain a group of order $p$, and thus none can recognize $L$.

□ 

6. EF 

The logic EF was one of the first logics over trees to have a decidable characterization [3].
The result has since been reproved several times with different methods [25, 13]. Here
we give a new proof based on the wreath product. Our argument is purely algebraic. It
computes a decomposition based on the ideal structure of the underlying forest algebra.
The following theorem is proved in [3].

Theorem 6.1. A forest language $L \subseteq H_A$ is defined by a forest formula of EF if and only
if (i) $H_L$ is idempotent and commutative, and (ii) for every $v \in V_L$, $h \in H_L$, we have
$vh + h = vh$.

Because this property can be effectively verified from the multiplication tables of $H_L$
and $V_L$, we have an effective characterization of EF. More specifically, there is a decision
procedure for determining whether or not a forest language given, say, by an automaton
that recognizes it, is definable by a forest formula of EF. This procedure can also be adapted
to testing whether a tree language is EF-definable with tree semantics.
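The check on the multiplication tables can be sketched as follows. This is a minimal illustration under assumed encodings: horizontal elements are the integers 0, ..., n-1, `add` is the addition table, and each vertical element is a tuple listing its action. The function name `satisfies_ef_conditions` is hypothetical, and computing the syntactic forest algebra from an automaton is not covered here.

```python
def satisfies_ef_conditions(H, add, V):
    """Check conditions (i) and (ii) of Theorem 6.1 on explicit tables.

    H   -- list of horizontal elements 0..n-1
    add -- add[g][h] is the horizontal sum g + h
    V   -- list of vertical elements; v[h] is the result of the action v h
    """
    idempotent = all(add[h][h] == h for h in H)
    commutative = all(add[g][h] == add[h][g] for g in H for h in H)
    absorbing = all(add[v[h]][h] == v[h] for v in V for h in H)   # vh + h = vh
    return idempotent and commutative and absorbing


# Tables for U_1 and U_2, writing 1 for the absorbing horizontal element.
H = [0, 1]
add = [[0, 1], [1, 1]]
U1_vertical = [(0, 1), (1, 1)]            # identity and the constant map to 1
U2_vertical = [(0, 1), (0, 0), (1, 1)]    # identity and both constant maps
```

With these tables, `satisfies_ef_conditions(H, add, U1_vertical)` holds, while the constant map to 0 in $U_2$ violates $vh + h = vh$.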

In light of Theorem 5.2, Theorem 6.1 can be formulated as follows.

Theorem 6.2. A forest algebra $(H, V)$ divides an iterated wreath product of copies of $U_1$ if
and only if $H$ is idempotent and commutative, and $vh + h = vh$ for all $h \in H$, $v \in V$.

Note that Theorem 6.2 is purely algebraic. It makes no mention of trees, forests,
languages or logic. This suggests that it might be proved reasoning solely from the structure
of the forest algebra.

Here we present such a proof. The easy direction is to show that every divisor of an
iterated wreath product of copies of $U_1$ is horizontally idempotent and commutative and
satisfies the identity $vh + h = vh$. Identities are always preserved under division, and
obviously $U_1$ itself satisfies the properties, so we just need to show that the properties are
preserved under wreath product. Let $(G, W)$ and $(H, V)$ be forest algebras satisfying the
identity, with $G$, $H$ idempotent and commutative. The horizontal monoid of the wreath
product is just $G \times H$, which is idempotent and commutative. Let $h = (h_0, h_1) \in G \times H$ and
$v = (v_0, f) \in W \times V^G$ be horizontal and vertical elements of the wreath product. We have

$$\begin{aligned}
vh + h &= (v_0, f)(h_0, h_1) + (h_0, h_1) \\
&= (v_0 h_0 + h_0,\ f(h_0) h_1 + h_1) \\
&= (v_0 h_0,\ f(h_0) h_1) \\
&= (v_0, f)(h_0, h_1) \\
&= vh.
\end{aligned}$$

For the converse, we suppose $(H, V)$ is horizontally idempotent and commutative and
satisfies the identity. We prove by induction on $|H|$ that $(H, V)$ divides an iterated wreath
product of copies of $U_1$.

Since $H$ is idempotent and commutative, it is partially ordered by the relation $\le$ defined
by $h_1 \le h_2$ if and only if $h_1 = h_2 + h$ for some $h \in H$. Transitivity and reflexivity of this
relation are obvious. Antisymmetry follows from the observation that if $h_1 = h_2 + h \le h_2$,
then $h_1 + h_2 = h_2 + h + h_2 = h_2 + h = h_1$. Thus if we have both $h_1 \le h_2$ and $h_2 \le h_1$, then
$h_1 = h_1 + h_2 = h_2$. This is just the standard $\mathcal{J}$-ordering, one of the Green relations, on the
monoid $H$. Thus our identity $vh + h = vh$ implies $vh \le h$ for all $v \in V$, $h \in H$. Conversely,
if $vh \le h$, then there is some $h' \in H$ such that $vh = h + h'$, and thus $vh + h = h + h' + h =
h + h' = vh$. So we can replace the identity by the inequality $vh \le h$ for all $v \in V$, $h \in H$.

The sum of all the elements of $H$ is the (necessarily unique) absorbing element, which,
following our usual practice, we denote $\infty$. This is the unique $\le$-minimal element, since
obviously $\infty + h = \infty$ for all $h \in H$. If $|H| \le 2$, then $(H, V)$ is either trivial, or isomorphic
to $U_1$, so we can assume $|H| > 2$. Thus there is at least one minimal element $h \ne 0$ in
$H \setminus \{\infty\}$. We call such an element a subminimal element. It has the property that for all
$v \in V$, $vh = h$ or $vh = \infty$.

For each subminimal $h$, we define $H_h$ to be the set $\{\infty\} \cup \{g : h \in Vg\}$. Observe that
$H_h$ is a submonoid of $H$, because if $v_1 h_1 = h$ and $v_2 h_2 = h$, then

$$h = h + h = v_1 h_1 + v_2 h_2 + h_1 + h_2 = u(h_1 + h_2), \quad\text{where } u = v_1 h_1 + v_2 h_2 + 1.$$

For $v \in V$ and $g \in H_h$ we set $v * g = vg$ if $vg \in H_h$, and otherwise set $v * g = \infty$. It is
straightforward to verify that for all $v_1, v_2 \in V$, $g \in H_h$,

$$v_1 * g + g = v_1 * g,$$

$$(v_1 v_2) * g = v_1 * (v_2 * g),$$

so we get a well-defined action of $V$ on $H_h$. We can collapse this action to make it faithful,
and thus we get a well-defined forest algebra $(H_h, V_h)$ that satisfies the hypotheses of the
theorem. If there is more than one subminimal element, then each $H_h$ has strictly smaller
cardinality than $H$. Further, consider the map

$$\iota : (H, V) \to \prod_h (H_h, V_h),$$

where the direct product is over all subminimal elements $h$, defined by setting the
$h$-component of $\iota(g)$ to be $g$ if $g \in H_h$, and $\infty$ otherwise. It is straightforward to verify
that $\iota$ is a homomorphism embedding $(H, V)$ into the direct product. Since the direct
product in turn embeds into the wreath product, we get the result by the inductive hypothesis.

It remains to consider the case where there is just one subminimal element $h$. In this
case $(H_h, V_h)$ is identical to $(H, V)$. The elements of $H$ different from $\infty$ form a submonoid
$G$ of $H$. We get a well-defined action $**$ of $V$ on $G$ by setting $v ** g = vg$ if $vg \in G$, and
$v ** g = h$ otherwise. Once again, the resulting forest algebra $(G, W)$ satisfies the necessary
identities, so by the inductive hypothesis $(G, W)$ divides a wreath product of copies of $U_1$.
We complete the proof by showing that $(H, V)$ embeds in the wreath product $(G, W) \circ U_1$.
We map $g \in H - \{\infty\}$ to $\alpha(g) = (g, 0)$ and $\infty$ to $\alpha(\infty) = (h, \infty)$. We further map $v \in V$ to
$(v, f_v)$, where $f_v(g) = 0$ if $vg = \infty$, and $f_v(g) = 1$ otherwise. This is obviously an injective
homomorphism on the additive structure. To show that it is a homomorphism on the
multiplicative structure, it suffices to show that for all $v \in V$, $g \in H$, $\alpha(vg) = (v, f_v)\alpha(g)$.
There are several cases to consider. First, if $g = \infty$, then $vg = \infty$, so we have

$$\alpha(vg) = (h, \infty) = (v ** h, f_v(h)\infty) = (v, f_v)(h, \infty) = (v, f_v)\alpha(g).$$

If $g \ne \infty$ but $vg = \infty$, we have

$$\alpha(vg) = (h, \infty) = (v ** g, 0 \cdot 0) = (v ** g, f_v(g) \cdot 0) = (v, f_v)(g, 0) = (v, f_v)\alpha(g).$$

Finally, if neither $g$ nor $vg$ is $\infty$, we have

$$\alpha(vg) = (vg, 0) = (v ** g, 1 \cdot 0) = (v, f_v)(g, 0) = (v, f_v)\alpha(g).$$

□ 

Theorem 6.2 is the exact analogue for forest algebras of a theorem of Stiffler [21]
showing that a finite monoid is $\mathcal{R}$-trivial if and only if it divides a wreath product of copies
of $U_1$. Because of our conventions on the direction of the action, all our EF-algebras have
$\mathcal{L}$-trivial, rather than $\mathcal{R}$-trivial, vertical monoids.

7. Path Algebras and Distributive Algebras 
In this section we prove Theorems 5.3 and 5.4.






7.1. Distributive algebras. We begin with Theorem 5.3, whose proof is significantly
simpler than the proof of Theorem 5.4. Recall that a distributive algebra is a forest algebra
$(H, V)$ where $H$ is commutative and which satisfies

$$v(h_1 + h_2) = vh_1 + vh_2.$$

Note that instead of the two requirements, horizontal commutativity and the above identity,
we could use a single identity

$$v(h_1 + h_2) = vh_2 + vh_1,$$

which, when $v = 1$, gives also horizontal commutativity. Nevertheless, we prefer separating
the two conditions.

Theorem 5.3 says that a forest language is a boolean combination of languages EL
(respectively, languages EL with L first-order definable) if and only if it is recognized by a
distributive forest algebra (respectively, an aperiodic distributive forest algebra).

The "only if" part is fairly straightforward: applying any of the identities required from
a distributive algebra does not change the set of paths in a tree. For the "if" part, only a
little bit of effort is needed. The idea is that by applying the conditions on distributivity,
one can show that if $\alpha$ is a homomorphism into a distributive algebra, then a forest is equal
to the sum of its paths. More precisely, if $t$ is a forest with nodes $x_1, \ldots, x_n$ then

$$\alpha(t) = \alpha(t_1 + \cdots + t_n) \qquad (7.1)$$

where each tree $t_i$ is obtained by taking the node $x_i$ and removing all nodes from $t$ that are
not ancestors of $x_i$. This is depicted in the picture below.




Note that $H$, apart from being a commutative monoid, is also idempotent, by

$$h = (h + 1)(0 + 0) = (h + 1)0 + (h + 1)0 = h + h.$$

In particular, the value of

$$\alpha(t) = \alpha(t_1 + \cdots + t_n) = \alpha(t_1) + \cdots + \alpha(t_n)$$

does not depend on the order or multiplicity of types in the sequence $\alpha(t_1), \ldots, \alpha(t_n)$, but
only on the set of values $\{\alpha(t_1), \ldots, \alpha(t_n)\}$. For each $g \in H$, we define the word language

$$L_g = \{a_1 \cdots a_i \in A^* : \alpha(a_1 \cdots a_i 0) = g\}.$$

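It is not difficult to see that a forest $t$ as in (7.1) satisfies the formula $\mathsf{E}L_g$ if and only if one
of the types $\alpha(t_1), \ldots, \alpha(t_n)$ is $g$. Combining the observations above, we conclude that for
every $h \in H$,

$$\alpha(t) = h \quad\text{iff}\quad t \models \bigvee_{G \subseteq H,\ \sum G = h} \Big( \bigwedge_{g \in G} \mathsf{E}L_g \ \wedge \bigwedge_{g \in H - G} \neg \mathsf{E}L_g \Big),$$

where $\sum G$ denotes the sum in $H$ of the elements of $G$.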
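The identity (7.1) can be checked on a small distributive algebra in Python. The algebra used here is an assumption chosen for illustration, not one from the paper: horizontal elements are sets of states of a deterministic automaton reading each path from a node up to the root, addition is union, and the context a□ maps a set S to {step(start, a)} together with the image of S under the letter a. Forests are tuples of (label, children) pairs.

```python
def value(forest, step, start):
    """Image of a forest: the set of states reached by reading, for each node,
    the labels from that node up to its root."""
    result = frozenset()
    for label, children in forest:
        below = value(children, step, start)
        result |= {step(start, label)} | {step(q, label) for q in below}
    return result


def paths(forest, prefix=()):
    """Label sequences of all paths from a root to some node (one per node)."""
    for label, children in forest:
        word = prefix + (label,)
        yield word
        yield from paths(children, word)


def path_tree(word):
    """The tree t_i that consists of a single path carrying the given labels."""
    forest = ()
    for label in reversed(word):
        forest = ((label, forest),)
    return forest


# Hypothetical automaton: count the letters a modulo 3, ignore other letters.
step = lambda q, a: (q + 1) % 3 if a == "a" else q
t = (("a", (("b", ()), ("a", (("a", ()),)))), ("b", ()))   # a(b + a(a)) + b
```

Because union is idempotent and commutative, the union of `value(path_tree(w), step, 0)` over all `w` in `paths(t)` agrees with `value(t, step, 0)`, which is the content of (7.1) for this algebra.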

Furthermore, if the monoid $V$ is aperiodic, each word language $L_h$, as a word language
recognized by $V$, is first-order definable by the McNaughton-Papert theorem.

7.2. Path algebras. We now proceed to prove Theorem 5.4. We use the term path algebra
for a forest algebra that satisfies the conditions in the theorem, namely that the horizontal
monoid is aperiodic and commutative, along with identities (5.1) and (5.2), which
we recall here:

$$vg + vh = v(g + h) + v0 \qquad (5.1)$$

$$u(g + h) = u(g + uh) \quad\text{for } u \text{ such that } u^2 = u \qquad (5.2)$$

Recall that a path language is a boolean combination of the languages from the base of
graded PDL: languages of the form "at least $k$ paths in $L$", for some regular $L$. A first-order
definable path language is defined similarly, but $L$ is required to be definable in FO$_A[<]$.

Theorem 5.4 says that a forest language is a path language (respectively, a first-order
definable path language) if and only if it is recognized by a path algebra (respectively, a
vertically aperiodic path algebra).

The "only if" part is simple; the identities are designed to hold in any syntactic algebra
of a path language (respectively, a first-order definable path language). The rest of this
section is devoted to showing the "if" implication of the theorem.

For the moment, we concentrate on path algebras, as opposed to aperiodic path alge- 
bras. After doing the proof, we show how it can be modified to obtain the case for aperiodic 
path algebras. 

We begin with the following lemma, which illustrates the significance of identity (5.1).
When speaking of paths, we refer to paths that begin in one of the roots of a forest, but
that end in any node, not necessarily a leaf.

Lemma 7.1. Forests with the same multisets of paths have the same image under any
homomorphism into an algebra satisfying (5.1) and horizontal commutativity.

Proof. We will show that two forests with the same multisets of paths are equal in the
quotient of the free forest algebra under the identities (5.1) and $h + g = g + h$. In other
words, we show that if (5.1) and $h + g = g + h$ are treated as rewriting rules on real forests
(and not elements of the forest algebra), then any two forests with the same multisets of
paths can be rewritten into each other. The idea is to transform each forest into a normal
form, such that the normal form is uniquely determined by the multiset of paths. The
transformation into normal form works as follows. Let $t$ be a forest. Let $a_1, \ldots, a_n$ be the
labels that appear in the roots of $t$. By applying horizontal commutativity, the forest $t$ is
rewritten into a forest

$$\sum_i \big( a_i t_{i,1} + a_i t_{i,2} + \cdots + a_i t_{i,n_i} \big).$$

By applying the identity (5.1) and horizontal commutativity, the above is rewritten into

$$\sum_i \Big( a_i (t_{i,1} + t_{i,2} + \cdots + t_{i,n_i}) + \underbrace{a_i 0 + \cdots + a_i 0}_{(n_i - 1) \text{ times}} \Big).$$

Finally, for each $i$, we rewrite the forest $t_{i,1} + t_{i,2} + \cdots + t_{i,n_i}$ into normal form. The result of
this rewriting is a forest where every two different non-leaf nodes have a different sequence
of labels on their paths. Such a forest, modulo commutativity, is uniquely determined by
the multiset of paths. □
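The normal form from the proof can be sketched in Python. The encoding is an assumption for illustration: forests are tuples of (label, children) pairs, labels are strings, and the order of root labels in the normal form is fixed by sorting so that forests equal up to commutativity get the same representation. The function names are hypothetical.

```python
from collections import Counter


def path_multiset(forest, prefix=()):
    """Multiset of label sequences of paths from a root to any node."""
    bag = Counter()
    for label, children in forest:
        word = prefix + (label,)
        bag[word] += 1
        bag.update(path_multiset(children, word))
    return bag


def normal_form(forest):
    """Merge roots with equal labels using a s + a t -> a(s + t) + a0,
    then normalize the merged subforests recursively."""
    merged, extra = {}, Counter()
    for label, children in forest:
        if label in merged:
            extra[label] += 1                      # one more leaf a0 is produced
            merged[label] = merged[label] + children
        else:
            merged[label] = children
    result = []
    for label in sorted(merged):
        result.append((label, normal_form(merged[label])))
        result.extend((label, ()) for _ in range(extra[label]))
    return tuple(result)


t1 = (("a", (("b", ()),)), ("a", (("c", ()),)))    # a(b) + a(c)
t2 = (("a", (("b", ()), ("c", ()))), ("a", ()))    # a(b + c) + a
```

The forests `t1` and `t2` have the same multiset of paths and are sent to the same normal form, and taking the normal form does not change the multiset of paths.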
The above lemma shows that membership in a language $L$ recognized by an algebra
satisfying (5.1) is uniquely determined by the multiset of paths in a forest. However, this on its
own does not mean that $L$ is a path language (otherwise, we would not need identity (5.2)),
as witnessed by the following example.

Example. Consider the language $a^*(a + a)$. A forest belongs to this language if and only
if for some $n \in \mathbb{N}$, the multiset of paths is

$$\epsilon,\ a,\ a^2,\ \ldots,\ a^n,\ a^{n+1},\ a^{n+1}.$$

This language is not a path language. It does not even belong to a quite general class
defined below. Let $\alpha : A^* \to M$ be a morphism from words into a finite monoid. The
$\alpha$-profile of a forest $s$ is a vector in $\mathbb{N}^M$ that says, for each $m \in M$, how many times a
path with value $m$ appears in the forest $s$. A language is called path-profile testable if for
some morphism $\alpha$, membership in the language is uniquely determined by the $\alpha$-profile of a
forest. It is not difficult to see that the language $a^*(a + a)$ is not even path-profile testable,
since a path-profile testable language will confuse $a^n(a + a^n a)$ with $a^n a^n(a + a)$ for certain
large values of $n$ (more precisely for $n = \omega$; the notion of $\omega$ will be defined below).
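The confusion between $a^n(a + a^n a)$ and $a^n a^n(a + a)$ can be observed on a concrete morphism. The sketch below makes several assumptions for illustration: the alphabet is the single letter a, the morphism sends $a^k$ to $k$ below a threshold and to $k$ modulo a period above it, the empty path is left out of the profile (it occurs once in both forests), and all function names are hypothetical.

```python
from collections import Counter


def chain(k, bottom):
    """The context a^k applied to the forest `bottom`."""
    forest = bottom
    for _ in range(k):
        forest = (("a", forest),)
    return forest


def path_lengths(forest, depth=0):
    """Lengths of all paths from a root to a node (all labels are a)."""
    for _, children in forest:
        yield depth + 1
        yield from path_lengths(children, depth + 1)


def profile(forest, threshold, period):
    """Profile for the morphism a^k -> k if k < threshold,
    else threshold + (k - threshold) mod period."""
    def alpha(k):
        return k if k < threshold else threshold + (k - threshold) % period
    return Counter(alpha(k) for k in path_lengths(forest))


def in_language(forest):
    """Membership in a*(a + a): a chain of a's ending in exactly two leaves."""
    while len(forest) == 1:
        forest = forest[0][1]
    return len(forest) == 2 and all(children == () for _, children in forest)


leaf = (("a", ()),)
n = 4                                          # a multiple of the period below
s = chain(n, leaf + chain(n, leaf))            # a^n (a + a^n a)
t = chain(2 * n, leaf + leaf)                  # a^n a^n (a + a)
```

With threshold 3 and period 2, the two forests have the same profile although only `t` belongs to the language.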

We now return to proving the "if" implication in Theorem 15. 4i The theorem follows 
immediately from the proposition below, by taking v to be the empty context. 

Proposition 7.2. Let (H, V) be a path algebra. For any v (z V and h G H , the forest 
language {t : va{t) = h} is a path language. 

For the rest of this section we fix a path algebra {H, V) and a homomorphism a : — )■ 
(H,V). For a tree t we will often refer to a{t) as type oft. Similarly for contexts. 

We will prove the proposition by induction on the size of the set vV C V. We write 
V ^ w if vV = wV (this is Green's 7^-equivalence in the context monoid). 

Apart from Green's relations, we will also use the $\omega$ power from monoid theory. For a
finite monoid (in this case, the monoid is $V$) we define $\omega$ to be a number such that $v^\omega$
is idempotent for any $v \in V$. Such a number always exists in a finite monoid; it suffices to
take $\omega$ to be the factorial of the size of $V$.
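The choice of $\omega$ can be illustrated on a small transformation monoid. This is a minimal sketch, assuming elements of the monoid are given as tuples acting on $\{0, \ldots, n-1\}$; the generator used below is a made-up example, and the helper names are hypothetical.

```python
from math import factorial

def compose(f, g):
    """(f . g)(x) = f(g(x)) for transformations given as tuples."""
    return tuple(f[x] for x in g)

def power(f, k):
    result = tuple(range(len(f)))  # identity transformation
    for _ in range(k):
        result = compose(f, result)
    return result

def generated_monoid(gens):
    """All products of the generators, including the identity."""
    n = len(gens[0])
    monoid = {tuple(range(n))}
    frontier = list(monoid)
    while frontier:
        f = frontier.pop()
        for g in gens:
            h = compose(g, f)
            if h not in monoid:
                monoid.add(h)
                frontier.append(h)
    return monoid

def is_idempotent(f):
    return compose(f, f) == f

def smallest_omega(monoid):
    """Least k >= 1 such that v^k is idempotent for every v in the monoid."""
    k = 1
    while not all(is_idempotent(power(v, k)) for v in monoid):
        k += 1
    return k

# Assumed example: a 3-cycle on {0, 1, 2} with a tail 3 -> 0.
V = generated_monoid([(1, 2, 0, 0)])
omega = factorial(len(V))
```

Here the factorial of the size of the monoid is a valid, though not minimal, choice of exponent; `smallest_omega` finds the least one by brute force.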

The induction base is when the set vV is minimal. 

Lemma 7.3. If $vV$ is minimal, then the context $v$ is constant, which means that $vg = vh$
holds for every $g, h \in H$.

Proof. Let $h_1, \ldots, h_n$ be all elements of $H$. Consider the context

$$w = v(h_1 + \cdots + h_n + 1).$$

We show that the context $w$ is constant. It suffices to show that $wh = w0$ for every $h \in H$.
Because $h_1, \ldots, h_n$ contains all elements of $H$, there must be some $i$ such that $h_i$ is
$\omega \cdot h$, which is defined by

$$\omega \cdot h = \underbrace{h + \cdots + h}_{\omega \text{ times}}.$$

By aperiodicity of $H$, we know that $h_i + h = h_i$. By commutativity of $H$, we see that

$$h_1 + \cdots + h_n = h_1 + \cdots + h_n + h$$

and therefore $wh = w0$. We have thus established that $w$ is constant. Since $w$ is constant,
one can easily see that $wu = w$ holds for every $u \in V$, and therefore $wV = \{w\}$. Since
$w \in vV$, it follows that $wV \subseteq vV$. By minimality of $vV$, we infer that $vV = \{w\}$. Because
$V$ contains an identity context, it follows that $v \in vV$ and therefore $w = v$, and therefore $v$
is constant. □

When the context $v$ is constant, the language in the proposition is either empty or contains
all forests; in either case it is a path language.

We now proceed to the induction step. We fix v and h as in the statement of the 
proposition. 

A path context is a context of the form $a_1 \cdots a_n \Box$. A preserving context is a context
$p$ whose type satisfies $v\alpha(p) \sim v$, for our fixed $v$. A forest is called negligible if it is a
concatenation of trees of the form $p0$, where $p$ is a preserving context.

Lemma 7.4. If $vu \sim v$ then $v = v(1 + u0)$. In particular, if $g$ is the type of a negligible
forest, then $v = v(g + 1)$.

Proof. In the proof, we will use the identity

$$v^\omega = v^\omega(1 + v^\omega 0). \qquad (7.2)$$

Note that by iterating the above $\omega$ times, we get

$$v^\omega = v^\omega(1 + \omega \cdot v^\omega 0). \qquad (7.3)$$

In the above, $\omega \cdot h$, or more generally $n \cdot h$ for any number $n$ and forest type $h$, denotes
the $n$-fold sum $h + \cdots + h$. The proof of (7.2) is by applying the identity (5.2) from the
definition of path algebras:

$$v^\omega g = v^\omega(g + 0) = v^\omega(g + v^\omega 0).$$

We now proceed to prove the lemma. If $vu \sim v$ then $vuw = v$ for some $w$. We can
assume that $uw$ is idempotent, by replacing $uw$ with $(uw)^\omega$. By the identity (7.3), we get

$$v = vuw = v(uw)(uw + \omega \cdot uw0) = v(uw + \omega \cdot uw0).$$

Let $x = (uw + \omega \cdot uw0)$. We know that $x = x + \omega \cdot uw0$, and therefore also
$x^\omega = x^\omega + \omega \cdot uw0$. Hence

$$v = vx^\omega = vx^\omega(x^\omega + \omega \cdot uw0).$$

By applying (5.2) the above becomes

$$vx^\omega(1 + \omega \cdot uw0) = v(1 + \omega \cdot uw0).$$

If we can show that $\omega \cdot uw0 = \omega \cdot uw0 + u0$, then we would be done, by

$$v = v(1 + \omega \cdot uw0) = v(1 + \omega \cdot uw0 + u0) = v(1 + u0).$$

It remains to show $\omega \cdot uw0 = \omega \cdot uw0 + u0$:

$$\omega \cdot uw0 = \omega \cdot uw0 + \omega \cdot uw0 = \omega \cdot u(w0 + w0) + \omega \cdot u0$$

and the last expression is clearly invariant under adding $u0$. □
A guarded context is a path context $pa\Box$ where the prefix $p\Box$ is preserving, but the
whole context $pa\Box$ is not. A forest is in guarded form if it is a concatenation of trees of
the form $pat$, where $pa\Box$ is a guarded context. The following lemma shows that, up to
negligible forests, each forest has the same multiset of paths as some forest in guarded form.

We say two forests $t, t'$ are negligibly equivalent if for some negligible forests $s$ and $s'$,
the forests $t + s$ and $t' + s'$ have the same multiset of paths. This is indeed an equivalence
relation (it is transitive since a concatenation of negligible forests is also negligible).

Lemma 7.5. Each forest is negligibly equivalent to a guarded forest. 

Proof. Let $t$ be a forest. For each node $x$ in $t$, let $q_x$ be the path context $a_1 \cdots a_m \Box$ obtained
by reading the path that leads to $x$ inside $t$, including the node $x$ (which has the last label
$a_m$). Let $X$ be the set of nodes $x$ for which the context $q_x$ is a guarded context; in particular,
$q_x = p_x a_x \Box$, with $p_x$ a preserving path context and $a_x \in A$. Note that the set $X$ is an
antichain: a node $x \in X$ is chosen as the first time when the path leading to $x$ stops
preserving $v$. For each $x \in X$, let $t|_x$ be the subtree of the node $x$; the subtree includes $x$.
Let $t'$ be the forest obtained from $t$ by removing all subtrees $t|_x$, for $x \in X$. The forests

$$t + \sum_{x \in X} p_x 0 \qquad \text{and} \qquad t' + \sum_{x \in X} p_x t|_x$$

clearly have the same multisets of paths. Since $\sum_{x \in X} p_x 0$ is a negligible forest, and
$\sum_{x \in X} p_x t|_x$ is a guarded forest, it remains to prove that $t'$ is negligibly equivalent to the
empty forest. But this follows since all paths inside $t'$ correspond to preserving contexts,
by construction of $t'$. □

A path language $L$ is called guarded if it is invariant under concatenation with negligible
forests, i.e.

$$t + s \in L \quad \text{iff} \quad t \in L$$

holds for any negligible forest $s$.

Lemma 7.6. For any $h \in H$ there is a guarded path language $L_h$ such that for any guarded
forest $t$,

$$t \in L_h \quad \text{iff} \quad v\alpha(t) = vh. \qquad (7.4)$$

Before showing the lemma above, we show in the lemma below that it concludes the
proof of Proposition 7.2.

Lemma 7.7. Let $h$ and $L_h$ be as in Lemma 7.6. Then the equivalence (7.4) holds for all
forests $t$, and not only guarded forests.

Proof. Let $t$ be a forest. By applying Lemma 7.5 we can find negligible forests $s, s'$ and a
guarded forest $t'$ such that $t + s$ and $t' + s'$ have the same multiset of paths.

We begin with the left to right implication in (7.4). Assume that $t \in L_h$. Since $s$
is negligible and the language $L_h$ is guarded, $L_h$ also contains $t + s$. Since $L_h$ is a path
language, and the forests $t + s$ and $t' + s'$ have the same multiset of paths, $L_h$ also
contains $t' + s'$. We now apply Lemma 7.6 to conclude that $v\alpha(t' + s') = vh$. Since $t' + s'$
and $t + s$ have the same multiset of paths, they have the same value under $\alpha$ by Lemma 7.1.
This gives us

$$vh = v\alpha(t' + s') = v\alpha(t + s) = v\alpha(t),$$

where the last equality is by Lemma 7.4.

The right to left implication is by reversing the above reasoning. □
7.3. The path language $L_h$. We are only left with proving Lemma 7.6.

We say that two types $g, h$ are $v^+$-equivalent if $vug = vuh$ holds for any context $vu \not\sim v$.

Lemma 7.8. For every $h \in H$, there is a path language that consists of exactly the forests whose
type is $v^+$-equivalent to $h$.

Proof. Define

$$W = \{w \in vV : w \not\sim v\}.$$

By definition, a forest type $g \in H$ is $v^+$-equivalent to $h$ if and only if $wg = wh$ holds for all
$w \in W$. By the induction assumption in Proposition 7.2, we know that for every $w \in W$
and $g \in H$, the forest language

$$L_{w,g} = \{t : w \cdot \alpha(t) = g\}$$

is a path language. The type of a forest $t$ is $v^+$-equivalent to $h$ if for every context $w \in W$,
the result of placing $t$ in a context of type $w$ is the same as the result of placing $h$ in $w$. In
other words, the set of forests whose type is $v^+$-equivalent to $h$ is

$$\bigcap_{w \in W} L_{w,wh},$$

which is a path language, as an intersection of path languages. □

Below, we write guard for the set of pairs $(w, a) \in V \times A$ such that $vw \sim v$ but
$vw\alpha(a) \not\sim v$. In other words, a pair $(w, a) \in$ guard describes a guarded context $pa\Box$. Consider
a forest in guarded form

$$t = p_1 a_1 t_1 + \cdots + p_n a_n t_n.$$

For each $(w, a) \in$ guard, let $I_{w,a}$ be the set of indexes $i$ such that $\alpha(p_i) = w$ and $a_i = a$.
The guarded profile of this forest is the function

$$\tau_t : \text{guard} \to \{0, 1, \ldots, \omega\} \times [H]_{\equiv_{v^+}},$$

where $[H]_{\equiv_{v^+}}$ is the set of $v^+$-equivalence classes of $H$. It maps a pair $(w, a)$ to the
pair $(n, h)$, where $n$ is the size of $I_{w,a}$ (up to threshold $\omega$) and $h$ is the $v^+$-equivalence
class of the sum of the types $\alpha(t_i)$ over $i \in I_{w,a}$.

Lemma 7.6 (and thus also Proposition 7.2 and Theorem 5.4) will follow from the two
lemmas below.

Lemma 7.9. For a guarded forest $t$, the guarded profile determines the value $v\alpha(t)$. In
other words, if $s, t$ are guarded forests with the same guarded profile, then $v\alpha(s) = v\alpha(t)$.

Lemma 7.10. For a guarded forest, the guarded profile can be determined by a path language.
In other words, for each guarded profile $\tau$, there is a path language $L_\tau$ such that
$t \in L_\tau \Leftrightarrow \tau_t = \tau$ holds for all guarded forests $t$.

The above two lemmas give us Lemma 7.6 by taking $L_h$ to be the union of all $L_\tau$, for
profiles $\tau$ of the form $\tau = \tau_t$ where $t$ is a guarded forest with $v\alpha(t) = vh$. We begin with the
proof of Lemma 7.9.






Proof (of Lemma 7.9). Let $s, t$ be guarded forests with the same guarded profile. Our goal
is to show that $v\alpha(s) = v\alpha(t)$.

Let $\tau$ be the guarded profile of $s$ and $t$. Let $(w_1, a_1), \ldots, (w_m, a_m)$ be all elements of
guard, and let $(k_i, x_i)$ be the value $\tau(w_i, a_i)$. Recall that $k_i$ is the number of times a path of
the form $pa_i$ appears in the forest with $\alpha(p) = w_i$. By repeatedly applying (5.1), horizontal
commutativity and aperiodicity, we know that the type of $s$ is

$$\alpha(s) = \sum_{i : k_i \geq 1} w_i \alpha(a_i) h_i + \sum_{i} (k_i - 1) \cdot w_i \alpha(a_i) 0,$$

for some $h_1, \ldots, h_m \in H$ such that each $h_i$ belongs to the $v^+$-equivalence class $x_i$. Likewise,
we can decompose

$$\alpha(t) = \sum_{i : k_i \geq 1} w_i \alpha(a_i) g_i + \sum_{i} (k_i - 1) \cdot w_i \alpha(a_i) 0.$$

So the only difference between the types of $s$ and $t$ is that the first type uses $h_1, \ldots, h_m$
and the second type uses $g_1, \ldots, g_m$. However, we know that the types $h_i$ and $g_i$ are
$v^+$-equivalent, for any $i$. We will conclude the proof by showing that

$$v(w_i \alpha(a_i) h_i + h) = v(w_i \alpha(a_i) g_i + h)$$

holds for any $h \in H$. By applying the equality above for all $i = 1, \ldots, m$, we get the
desired $v\alpha(s) = v\alpha(t)$. By definition of $v^+$-equivalence, the above equality would follow if
we showed that $v(w_i \alpha(a_i) \Box + h) \not\sim v$. This will be shown in Lemma 7.11. □

Lemma 7.11. If $vu \not\sim v$ then $v(u + h) \not\sim v$.

Proof. Toward a contradiction, assume that $w$ is such that

$$v(uw + h) = v.$$

We assume that $(uw + h)$ is idempotent. By (7.3), we get

$$v = v(uw + h)(uw + h + \omega(uw0 + h)) = v(uw + h + \omega(uw0 + h)) = v(uw + \omega(uw0 + h)).$$

Let $x = uw + \omega(uw0 + h)$. By the above we know that $vx = v$. By definition of $x$ we know
that $x = x + h$, and in particular $x^\omega = x^\omega + h$. By identity (5.2), we get

$$x^\omega = x^\omega(x^\omega + h) = x^\omega(1 + h).$$

Therefore,

$$v = vx = vx^\omega = vx^\omega(1 + h) = v(1 + h).$$

It follows that $vu = v(1 + h)u = v(u + h) \sim v$, contradicting the assumption $vu \not\sim v$.

□
Now we proceed to prove Lemma 7.10.
Proof. We will show that for each

$$(w, a) \in \text{guard} \quad \text{and} \quad (i, x) \in \{0, 1, \ldots, \omega\} \times [H]_{\equiv_{v^+}}$$

there is a path language $L_{(w,a),(i,x)}$ such that

$$t \in L_{(w,a),(i,x)} \Leftrightarrow \tau_t(w, a) = (i, x)$$

holds for any guarded forest $t$. This gives Lemma 7.10 by setting

$$L_\tau = \bigcap_{(w,a) \in \text{guard}} L_{(w,a),\tau(w,a)}.$$

Fix $(w, a)$ and $(i, x)$. The easier part is to enforce that the first coordinate of $\tau_t(w, a)$ is $i$:
we just have to say that the forest $t$ has $i$ paths in the word language

$$K_{w,a} = \{a_1 \cdots a_n a \in A^* : \alpha(a_1 \cdots a_n \Box) = w\}.$$

Only slightly more effort is required in enforcing that the second coordinate of $\tau_t(w, a)$ is
$x$. By Lemma 7.8 we know that the set $M_x$ of forests whose type is in the $v^+$-equivalence
class $x$ is defined by a boolean combination of path formulas. To enforce that the second
coordinate of $\tau_t(w, a)$ is $x$, we use the same boolean combination, except that every word
language is prefixed by $K_{w,a}$. □

As we promised before, we now prove that if the path algebra $(H, V)$ in the statement
of Theorem 5.4 is vertically aperiodic, then the path language only needs to use first-order
definable word languages. It suffices to look at the only place where we actually
wrote word languages: in the lemma above. The word language $K_{w,a}$ is
obtained by concatenating $a$ to a word language that is recognized by the vertical monoid
$V$, via the morphism $a \mapsto \alpha(a\Box)$. Since $V$ is aperiodic, we can use the Schützenberger and
McNaughton-Papert theorems to conclude that $K_{w,a}$ is first-order definable.

Actually, the argument above can be further generalized to any variety of word languages
given by monoids such that the corresponding language class is closed under concatenation
and contains the one-letter languages $\{a\}$. Note that any such language class
necessarily contains all first-order definable languages, since it captures all star-free expressions.



8. MULTICONTEXTS AND CONFUSION 

Here we find necessary conditions for a forest algebra to be a CTL-algebra, an FO-algebra
or a graded PDL-algebra. We use these conditions to show that certain languages cannot
be expressed in CTL, FO, or PDL. The conditions we find are essentially the absence of
certain kinds of configurations in the forest algebra, analogous to the 'forbidden patterns'
of Cohen-Perrin-Pin [8] and Wilke [M].

Let $A$ be a finite alphabet. A multicontext $p$ over $A$ is a forest in which some of the
leaves have been replaced by a special symbol □, each occurrence of which is called a hole
of the multicontext. A special kind of multicontext, called a uniform multicontext, is one in
which every leaf node is a hole, and all subtrees at the same level are identical. For example

$$a(b(c\Box + c\Box)) + a(b(c\Box + c\Box))$$

is a uniform multicontext.
The holes are used for substitution. The holes are independent in the sense that different
forests can be substituted into different holes. The set of holes of a multicontext $p$ is denoted
holes($p$). A valuation on $p$ is a map $\mu$ : holes($p$) $\to X$, where $X$ can be a set of forests, or of
multicontexts, or elements of $H$, where $(H, V)$ is a forest algebra. The resulting value $p[\mu]$,
found by substituting $\mu(x)$ for each hole $x$, is consequently either a multicontext, a forest,
or an element of $H$. In the last case, we are assuming the existence of a homomorphism
$\alpha : A^\Delta \to (H, V)$, evaluated at the nodes of $p$.

Given a set $G \subseteq H$ we write $p[G]$ for the set of all possible values of $p[\mu]$ where
$\mu$ : holes($p$) $\to G$. When $G = \{g\}$ is a singleton, we just write $p[g]$. For $g \in G$ and
$x \in$ holes($p$) we define $p[g/x]$ to be the multicontext that results from $p$ by putting a tree
that evaluates to $g$ in the hole $x$. (In particular, $p[g/x]$ has one less hole than $p$.)

We now define the various types of forbidden patterns for forest algebras.

8.1. Horizontal confusion. Let $(H, V)$ be a forest algebra. As above, we assume the
existence of a homomorphism from $A^\Delta$ into $(H, V)$ in order to define the valuations on $p$
with values in $H$. We say that $(H, V)$ has horizontal confusion with respect to a multicontext
$p$ and a set $G \subseteq H$ with $|G| > 1$ if for every $g \in G$ and $x \in$ holes($p$):

$$G \subseteq p[g/x][G].$$

Intuitively, this means that fixing the value of one of the holes of $p$ still allows us to obtain
any element of $G$ by putting suitable elements of $G$ into the remaining holes.

8.2. k-ary horizontal confusion. We can define a stronger version of confusion, which
seems to be satisfied by fewer forest algebras. In the stronger version, we are allowed to fix
the value in not just one, but in $k \geq 1$ holes: We say that the forest algebra $(H, V)$ has
k-ary horizontal confusion with respect to a multicontext $p$ and a set $G \subseteq H$, with $|G| > 1$,
if for all $g_1, \ldots, g_k \in G$ and $x_1, \ldots, x_k \in$ holes($p$),

$$G = p[g_1/x_1, \ldots, g_k/x_k][G].$$

The following lemma shows that the stronger notion is in fact equivalent to horizontal
confusion, because we can always amplify horizontal confusion to k-ary horizontal confusion
for arbitrary $k$.

Lemma 8.1. Suppose $(H, V)$ has horizontal confusion with respect to a multicontext $p$ and
a subset $G$ of $H$, with underlying homomorphism $\phi : A^\Delta \to (H, V)$. Let $k > 0$. Then there
is a multicontext $p_k$ such that $(H, V)$ has k-ary horizontal confusion with respect to $p_k$, $G$
and $\phi$.

Proof. We prove this by induction on $k$. We have $p_1 = p$, by hypothesis. If $k > 1$, we define
$p_k$ by placing a copy of $p_{k-1}$ in each of the holes of $p$. To see that this works, fix the values
in $G$ of $k$ of the holes of $p_k$. If the $k$ holes do not all belong to the same copy of
$p_{k-1}$, then each copy has at most $k - 1$ holes fixed, and thus we can set the values in
the remaining holes to get any elements of $G$ we want in the holes of $p$, and consequently
any element of $G$ as a value of $p_k$. If the $k$ holes all belong to the same copy of $p_{k-1}$, then
the resulting value $g \in G$ produced by this copy might be determined, but this will only
constrain the value in one of the holes of $p$. Since $p$ has horizontal confusion, we can set the
remaining holes of $p$ to values $g_1, \ldots, g_r$ to obtain any desired value as output, and we can
in turn set the values of the other copies of $p_{k-1}$ to obtain these values $g_1, \ldots, g_r$. □
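The construction of $p_k$ in the proof is purely syntactic and can be sketched in a few lines. The encoding below is an assumption made for illustration (a multicontext is a list of nodes, a hole is `None`, a tree is a pair of a label and a list of children); the function names are hypothetical.

```python
HOLE = None  # assumed encoding of the hole symbol

def substitute(p, filler):
    """Place a copy of the multicontext `filler` into every hole of `p`."""
    out = []
    for node in p:
        if node is HOLE:
            out.extend(filler)
        else:
            label, children = node
            out.append((label, substitute(children, filler)))
    return out

def count_holes(p):
    return sum(1 if node is HOLE else count_holes(node[1]) for node in p)

def amplify(p, k):
    """p_1 = p, and p_k is p with a copy of p_(k-1) in each of its holes."""
    pk = p
    for _ in range(k - 1):
        pk = substitute(p, pk)
    return pk

# Example multicontext a(hole + b(hole)) with two holes.
p = [("a", [HOLE, ("b", [HOLE])])]
p3 = amplify(p, 3)
```

If $p$ has $r$ holes, then $p_k$ has $r^k$ holes, which is why fixing only $k$ of them still leaves enough freedom in the argument above.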
8.3. Vertical confusion. We say that the forest algebra $(H, V)$ has vertical confusion
with respect to a multicontext $p$ and a set $\{g_0, \ldots, g_{k-1}\} \subseteq H$ with $k > 1$ if for every
$i = 0, \ldots, k - 1$:

$$p[g_i] = g_j \quad \text{where } j = (i + 1) \pmod{k}.$$

This condition is weaker than periodicity of the vertical monoid, because $p$ is a multicontext, and
not just a context. For instance, consider the syntactic forest algebra of the tree language
$L$, which consists of trees where every node has two or zero children, and where every leaf
is at even depth. As shown in Section 9.2, this algebra has vertical confusion with respect to
the uniform multicontext $a(\Box + \Box)$, although its vertical monoid is aperiodic.

8.4. Confusion Theorem. The next theorem shows how the various types of confusion 
are forbidden in CTL-, FO- and PDL-algebras. 

Theorem 8.2 (Confusion Theorem). 

• If $(H, V)$ is a CTL-algebra, it does not have vertical confusion with respect to any
multicontext.

• If $(H, V)$ is an FO-algebra, it does not have vertical confusion with respect to any uniform
multicontext.

• If $(H, V)$ is a graded PDL-algebra, it does not have horizontal confusion with respect to
any multicontext.

Proof. For each of the three kinds of confusion and each of the corresponding language 
classes, we will show that the nonconfusing property (a) holds for the elements of the 
algebraic base of the class, (b) is preserved by wreath products, and (c) is preserved by 
quotients and subalgebras. 

We begin with vertical confusion and the class CTL, which has $U_2$ as an algebraic base.
Let $\phi : A^\Delta \to U_2$ be a homomorphism, and suppose $p$ is a multicontext over $A$ such that
$U_2$ has vertical confusion with respect to $p$ and $\phi$. Since $U_2$ is distributive, we have

$$p[g] = \sum_{u \in \pi} \phi(u) g + \sum_{v \in \rho} \phi(v) \cdot 0,$$

where the first sum ranges over the set $\pi$ of paths in $p$ from a root to the parent of a hole,
and the second over the set $\rho$ of paths from the root to a leaf. We claim that for $g \in \{0, \infty\}$,
$p[p[g]] = p[g]$. This follows easily from an enumeration of the possible cases: If $g = p[g]$,
then the claim is trivial, so we can assume that either $g = \infty$ and $p[g] = 0$, or $g = 0$ and
$p[g] = \infty$. In the first case, every path in $\pi$ has a prefix $wa$ with $a \in A$, $\phi(a) = c_0$, and
$\phi(b) = 1$ for every letter $b$ of $w$, and every path in $\rho$ has either this form or has $\phi(b) = 1$ for
every letter $b$. It follows that $p[0] = 0$. In the second case, some path in $p$ has a prefix $wa$
with $\phi(a) = c_\infty$ and $\phi(b) = 1$ for every letter $b$ of $w$, and thus $p[\infty] = \infty$. Since $p[p[g]] = p[g]$
for all $g$ in the horizontal monoid of $U_2$, we cannot have vertical confusion.

We now consider the base algebras for FO[≺]. Suppose that $\phi : A^\Delta \to (H, V)$ is a
homomorphism into an aperiodic path algebra. Then, by Theorem 5.4, every language
recognized by $\phi$ is an FO path language, that is, a boolean combination of languages of
the form $E^k L$, where $L \subseteq A^*$ is a first-order definable word language. Let $p$ be a uniform
multicontext over $A$. Since $p$ is uniform, every maximal path in $p$ has the same label $u \in A^*$.
We can dispense with the case where $p$ has a single hole, because then $p[g]$ reduces to $\phi(u) \cdot g$,
and by aperiodicity of the vertical monoid we have, for some $n > 0$,
$p^{n+1}[g] = \phi(u^{n+1})g = \phi(u^n)g = p^n[g]$, so there is no vertical confusion. We thus suppose that $p$ has at least two
holes, so that $p^n$ is a multicontext with at least $2^n$ holes. Since every language recognized
by $\phi$ is an FO path language, there exists a congruence $\sim$ of finite index on $A^*$ and an
integer $k > 0$ such that $A^*/\sim$ is aperiodic, with the following property: If $s, t \in H_A$ are
such that for every $\sim$-class $\kappa$, the number of paths from the root of $s$ in $\kappa$ is equal, up to
threshold $k$, to the number of paths from the root of $t$ in $\kappa$, then $\phi(s) = \phi(t)$. ('Equal up
to threshold $k$' means either equal, or both at least $k$.) Since $A^*/\sim$ is aperiodic, there is
an integer $r$ such that $u^r \sim u^{r+1}$. Let $g \in H$, and let $s$ be any forest such that $\phi(s) = g$.
Choose $q$ such that both $q > r$ and $2^q > k$. Now consider the forests $p^{q+1}[s]$ and $p^q[s]$.
Suppose that a word occurs as the label of a path from the root in $p^{q+1}[s]$ more times than
it does in $p^q[s]$. Then, since $p$ is uniform, the word must have the form $u^{q+1}v$, and since
$u^{q+1}v \sim u^q v$, a word in the same $\sim$-class occurs at least $2^q > k$ times in $p^{q+1}[s]$. It follows
that $p^{q+1}[g] = \phi(p^{q+1}[s]) = \phi(p^q[s]) = p^q[g]$, so there is no vertical confusion.

We now consider the base algebras for graded PDL, so we suppose $\phi : A^\Delta \to (H, V)$ is
a homomorphism onto a path algebra $(H, V)$ which has horizontal confusion with respect
to a multicontext $p$ and a set $G \subseteq H$, with $|G| > 1$. Let $G = \{g_1, \ldots, g_n\}$, and let $s_1, \ldots, s_n$
be forests such that $\phi(s_i) = g_i$ for all $1 \leq i \leq n$. As above, there is a congruence $\sim$ of
finite index on $A^*$ and an integer $k$, such that if two forests agree on the number of paths
threshold $k$ and modulo $\sim$, then they have the same image under $\phi$. Let $m$ be the index of
$\sim$. (The only difference from the previous case is that we no longer have $A^*/\sim$ aperiodic.)
By Lemma 8.1 there is a multicontext $q$ such that $(H, V)$ has $km$-ary horizontal confusion with
respect to $q$. We order the classes of $\sim$ arbitrarily as $\kappa_1, \ldots, \kappa_m$. We proceed to insert forests
from $s_1, \ldots, s_n$ into the holes of $q$ according to the following algorithm: For each $\kappa_i$ in turn,
we ask if there is a way to substitute copies of the $s_j$ into the holes we have not yet filled
in order to obtain at least $k$ paths in $\kappa_i$. If so, we perform the necessary insertions; if not,
we insert enough copies of the $s_j$ to obtain the maximum possible number of paths in $\kappa_i$.
At the end of the process, we will have filled no more than $km$ holes. However, no further
substitution of forests $s_j$ for the remaining holes can increase the number, threshold $k$, of
paths in any class of $\sim$, and thus no matter how we fill the remaining holes, the value under
$\phi$ will be the same. But because of the $km$-ary confusion, we should be able to obtain
any value in $G$ by appropriately filling the remaining holes. Thus $|G| = 1$, so there is no
horizontal confusion.

We now show closure under wreath product. Suppose first that neither $(H_1, V_1)$ nor
$(H_2, V_2)$ has vertical confusion with respect to any multicontext. Let $\gamma$ be a homomorphism
from $A^\Delta$ into the wreath product $(H, V) = (H_1, V_1) \circ (H_2, V_2)$. Suppose $(H, V)$ has vertical
confusion with respect to some multicontext $p$ with underlying homomorphism $\gamma$. There thus
exist $g_i = (g_i^{(1)}, g_i^{(2)}) \in H = H_1 \times H_2$, with $i = 0, \ldots, n - 1$, such that

$$p_\gamma[g_i] = g_{(i+1) \bmod n}$$

for $0 \leq i < n$. (Note that here we explicitly indicate the homomorphism $\gamma$, since we will
shortly be applying the multicontext $p$ with respect to other homomorphisms.) By
Theorem IMl $\gamma = \alpha \otimes \beta$, where $\alpha : A^\Delta \to (H_1, V_1)$ and $\beta : (A \times H_1)^\Delta \to (H_2, V_2)$ are
homomorphisms. When we project onto the left co-ordinate, we obtain

$$p_\alpha[g_i^{(1)}] = g^{(1)}_{(i+1) \bmod n}.$$

Since $(H_1, V_1)$ does not have vertical confusion, all the $g_i^{(1)}$ must be equal. We will denote
their common value by $g^{(1)}$. We now form a new multicontext $p^{(\alpha, g^{(1)})}$ by first substituting
any forest evaluating to $g^{(1)}$ for the holes in $p$, which gives a forest $t$, then forming the
forest $t^\alpha$, and finally restoring the original holes. The resulting multicontext has the same
shape as $p$, but its nodes are now labeled by elements of $A \times H_1$. Because the value $g^{(1)}$ is
stable after each application of $p_\alpha$, we find that $p^{(\alpha, g^{(1)})}_\beta[g_i^{(2)}]$ is identical to the right-hand
coordinate of $p_\gamma[g_i]$, and thus we have

$$p^{(\alpha, g^{(1)})}_\beta[g_i^{(2)}] = g^{(2)}_{(i+1) \bmod n}$$

for all $0 \leq i < n$. Since $(H_2, V_2)$ does not have vertical confusion, we find that all the $g_i^{(2)}$,
and consequently all the $g_i$, are identical. So $(H, V)$ does not have vertical confusion.

In the case of vertical confusion with respect to uniform multicontexts, the proof is the
same; we simply note that the multicontext $p^{(\alpha, g^{(1)})}$ defined above is uniform whenever $p$ is.

In the case of horizontal confusion with respect to some $G \subseteq H_1 \times H_2$, we use essentially
the same argument: absence of confusion in the left coordinate permits us to reduce $G$ to
a set of the form $\{g^{(1)}\} \times G_2$, and we find that $H_2$ has horizontal confusion with respect to
$p^{(\alpha, g^{(1)})}$ and $G_2$, so that $|G_2| = 1$, and hence $|G| = 1$.

We now show that in each case the non-confusing property is preserved under division.
For subalgebras, this is trivial, but for quotients, there is something to prove. Accordingly,
suppose that $\psi : (H_1, V_1) \to (H_2, V_2)$ is a surjective homomorphism of forest algebras. Let
$\phi : A^\Delta \to (H_2, V_2)$ be a homomorphism. We can lift this to a homomorphism
$\pi : A^\Delta \to (H_1, V_1)$ such that $\psi\pi = \phi$. First suppose $(H_2, V_2)$ has vertical confusion with respect to
some multicontext $p$ and $\phi$. We will show $(H_1, V_1)$ has vertical confusion with respect to
$p$ and $\pi$. Vertical confusion in $(H_2, V_2)$ gives us a sequence $g_0, \ldots, g_{n-1}$ of elements of $H_2$
with $n > 1$ such that $p_\phi[g_i] = g_{(i+1) \bmod n}$ for all $0 \leq i < n$. Choose an element $h_0 \in H_1$
such that $\psi(h_0) = g_0$, and define $h_1, h_2, \ldots$ by $h_{j+1} = p_\pi[h_j]$. By finiteness, there exist $j < k$
such that $h_j = h_k$. Since $\psi(h_j) = g_{j \bmod n}$ and $\psi(h_k) = g_{k \bmod n}$, we have that $k - j$ is a multiple
of $n$, and in particular, $k - j > 1$. We thus have

$$p_\pi[h_{j+i}] = h_{j + ((i+1) \bmod (k-j))} \quad \text{for } 0 \leq i < k - j,$$

which gives vertical confusion in $(H_1, V_1)$.

Now suppose that we have horizontal confusion in $(H_2, V_2)$ with respect to $\phi$. We will
show how to obtain horizontal confusion in $(H_1, V_1)$. Let $m = |H_1|$. By Lemma 8.1 there
is a multicontext $p$ such that $(H_2, V_2)$ has $m$-ary horizontal confusion with respect to $\phi$
and some set $G' \subseteq H_2$. Let $G = \psi^{-1}(G')$. For $k > 0$, set $G_1^k = p^k[G]$. Since $p[G'] = G'$,
we have $\psi(G_1^k) = G'$ for all $k$. In particular, $G_1^1 \subseteq G$, and by repeatedly applying $p$ to
both sides of this inclusion we obtain $G_1^{k+1} \subseteq G_1^k$ for all $k$. Thus this sequence eventually
stabilizes, so we have some $n$ for which $G_1^{n+1} = G_1^n$. Let us set $G_1 = G_1^n$. Now it may be
that $(H_1, V_1)$ has horizontal confusion with respect to $p$, $\pi$, and $G_1$. If not, there is some
hole $x$ of $p$ and $g_1 \in G_1$ such that $p[g_1/x][G_1] \subsetneq G_1$. So we let $p'$ be the multicontext that
results from substituting a forest that evaluates under $\pi$ to $g_1$ for $x$, and set $G_2^1 = p'[G_1]$.
Note that $(H_2, V_2)$ has $(m - 1)$-ary horizontal confusion with respect to $p'$ and $\phi$, so we still
have $\psi(G_2^1) = G'$, as well as $G_2^1 = p'[G_1] \subsetneq G_1$, so that $|G_2^1| < |G_1| \leq |G|$. We now repeat the
procedure above, applying $p'$ to $G_2^1$ until the sequence stabilizes at a set $G_2$, then checking
if the result is a horizontal confusion for $(H_1, V_1)$, and filling a hole of $p'$ if it is not. We
have

$$|G'| \leq \cdots < |G_k| < |G_{k-1}| < \cdots < |G|,$$

so the process will terminate after no more than $|G| - |G'|$ generations, giving a horizontal
confusion in $(H_1, V_1)$. □
Theorem 8.3. It is decidable if a given forest algebra has horizontal confusion, vertical
confusion, or vertical confusion with respect to a uniform multicontext.

Proof. Confusion in a forest algebra $(H, V)$ appears to depend on the choice of alphabet $A$,
a multicontext $p$ over $A$, and a morphism from $A^\Delta$ into $(H, V)$. Observe, however, that we
can restrict attention to a single alphabet and morphism: Consider $V$ as a finite alphabet,
and the morphism $\beta : V^\Delta \to (H, V)$ induced by the identity map on $V$. If $(H, V)$ has a
confusion with respect to a multicontext $p$ over $A$ and morphism $\alpha : A^\Delta \to (H, V)$, then we
can transform it into a confusion of the same type with respect to $V$ and $\beta$ in the obvious
fashion, replacing each node label $a \in A$ of $p$ by $\alpha(a) \in V$. Thus in the argument
below, we suppress explicit mention of an alphabet and morphism and work simply with
the elements of $V$.

Vertical confusion. Testing whether $(H, V)$ has vertical confusion with respect to some
multicontext reduces to verifying whether a certain monoid containing $V$ is aperiodic. If
$v, w \in V$, we define $v + w$ to be the transformation on $H$ given by

$$(v + w)h = vh + wh$$

for all $h \in H$. Let $\bar{V}$ be the collection of all maps on $H$ containing $V$ and closed under
composition and addition. $\bar{V}$ then consists of all multicontexts over $(H, V)$. Furthermore,
$\bar{V}$ is effectively computable from $V$, since whenever we have a set $U$ of transformations on
$H$, we can check for each $v, w \in U$ whether $v + w$ and $vw$ belong to $U$, and if not, adjoin
them to $U$. Since there are only finitely many transformations on $H$, we eventually reach a
stage at which we can add no new elements to $U$, at which point the algorithm terminates.

$\bar{V}$ is a monoid under composition, and $(H, V)$ is free of vertical confusion if and only if
this monoid is aperiodic; i.e., if and only if $p^k = p^{k+1}$ for all $p \in \bar{V}$ and sufficiently large $k$,
which we can determine effectively.
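The saturation procedure just described can be sketched as follows. The six-element horizontal monoid below is a toy assumption loosely modeled on the binary-tree example of Section 9.2 (it is not claimed to be the syntactic algebra), and all names are hypothetical. Transformations on $H$ are tuples listing the image of each element.

```python
from itertools import product

# Toy horizontal monoid (an assumption): "0" is the empty forest, "e1"/"o1" a single
# binary tree whose maximal paths all have an even/odd number of nodes, "e2"/"o2" two
# such trees, and "bad" everything else.
H = ["0", "e1", "o1", "e2", "o2", "bad"]
INDEX = {h: i for i, h in enumerate(H)}

def add(g, h):
    if g == "0":
        return h
    if h == "0":
        return g
    if g == h and g in ("e1", "o1"):
        return g[0] + "2"
    return "bad"

IDENTITY = tuple(H)
# The context a[]: a new root over zero or two subtrees flips the parity.
A = tuple({"0": "o1", "e2": "o1", "o2": "e1"}.get(h, "bad") for h in H)

def compose(f, g):
    """(f . g)(h) = f(g(h))."""
    return tuple(f[INDEX[x]] for x in g)

def plus(f, g):
    """(f + g)(h) = f(h) + g(h)."""
    return tuple(add(x, y) for x, y in zip(f, g))

def closure(generators, with_sum):
    """Saturate under composition and, optionally, pointwise addition."""
    seen = set(generators)
    changed = True
    while changed:
        changed = False
        for f, g in product(list(seen), repeat=2):
            candidates = [compose(f, g)] + ([plus(f, g)] if with_sum else [])
            for c in candidates:
                if c not in seen:
                    seen.add(c)
                    changed = True
    return seen

def is_aperiodic(transformations):
    """A map on n points is aperiodic exactly when f^n = f^(n+1)."""
    n = len(H)
    def power(f, k):
        r = IDENTITY
        for _ in range(k):
            r = compose(f, r)
        return r
    return all(power(f, n) == power(f, n + 1) for f in transformations)

swap = compose(A, plus(IDENTITY, IDENTITY))  # the multicontext a([] + [])
```

In this toy setting the monoid generated by composition alone is aperiodic, while the closure that also allows sums contains `swap`, which exchanges `"e1"` and `"o1"` and therefore witnesses vertical confusion.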

Vertical confusion with respect to a uniform multicontext. The argument is the same as
above; however, now we must build a monoid containing $V$ that consists of exactly all the
uniform multicontexts. We accordingly close $V$ under composition and the operations

$$v \mapsto v + v + \cdots + v.$$

Observe that the number of summands in this expression can be bounded above by the size
of $H$, so we can compute this closure effectively as well. Let us denote the resulting monoid
$\tilde{V}$. $(H, V)$ does not have vertical confusion with respect to any uniform multicontext if and
only if $\tilde{V}$ is aperiodic.

Horizontal confusion. We now test if $(H, V)$ has horizontal confusion. The algorithm first
guesses the set $G$. For a multicontext $p$, we define its profile to be the pair

$$\pi(p) = (\{p[g/x][G] : g \in G, x \in \text{holes}(p)\},\ p[G]) \in P(P(H)) \times P(H).$$

The forest algebra has horizontal confusion with respect to a multicontext $p$ and $G$ if and
only if the profile $\pi(p)$ only has supersets of $G$ on the first coordinate. Therefore, to
determine if the forest algebra has horizontal confusion, it suffices to compute the set

$$Y = \{\pi(p) : p \text{ is a multicontext}\}.$$
This set is computed using a fix-point algorithm, since it is the least set that satisfies the
properties listed below. (In the implications, we lift the forest algebra operations to sets
$F \subseteq H$ and families of sets $\mathcal{F} \subseteq P(H)$ in the natural way.)

$$(\{\{g\} : g \in G\}, G) \in Y$$

$$(\mathcal{F}, F) \in Y \Rightarrow (v\mathcal{F}, vF) \in Y \text{ for every } v \in V$$

$$(\mathcal{F}_1, F_1), (\mathcal{F}_2, F_2) \in Y \Rightarrow (\mathcal{F}_1 + F_2 \cup F_1 + \mathcal{F}_2,\ F_1 + F_2) \in Y$$

□
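The fix-point computation of $Y$ can be sketched directly from the three closure properties. This is an illustration under assumptions: elements of $H$ are the integers $0, \ldots, |H|-1$, each element of $V$ is a tuple giving its action on $H$, the set $G$ is passed in rather than guessed, and the two tiny algebras at the end are made-up examples.

```python
from itertools import product

def lift(add, F1, F2):
    """Sum of two sets of horizontal elements."""
    return frozenset(add(g, h) for g in F1 for h in F2)

def horizontal_confusion(G, V, add):
    """Search the least fixed point Y for a profile with only supersets of G."""
    G = frozenset(G)
    Y = {(frozenset(frozenset([g]) for g in G), G)}  # profile of a single hole
    changed = True
    while changed:
        changed = False
        new = set()
        for fam, F in Y:  # apply a context v on top
            for v in V:
                new.add((frozenset(frozenset(v[h] for h in S) for S in fam),
                         frozenset(v[h] for h in F)))
        for (fam1, F1), (fam2, F2) in product(Y, repeat=2):  # concatenate
            fam = (frozenset(lift(add, S, F2) for S in fam1)
                   | frozenset(lift(add, F1, S) for S in fam2))
            new.add((fam, lift(add, F1, F2)))
        if not new <= Y:
            Y |= new
            changed = True
    return any(all(G <= S for S in fam) for fam, _ in Y)

# Assumed example 1: H = {0, 1} with addition modulo 2; V = identity and "add 1".
xor_confused = horizontal_confusion({0, 1}, [(0, 1), (1, 0)], lambda g, h: g ^ h)
# Assumed example 2: H = {0, 1} with max; V = identity and the constant 1.
max_confused = horizontal_confusion({0, 1}, [(0, 1), (1, 1)], max)
```

With addition modulo 2 the multicontext consisting of two holes already shows confusion, since fixing one hole still lets the other produce either value. With `max`, fixing a hole to 1 forces the result to 1, so no profile qualifies.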

9. Applications 

Here we apply the results of the preceding section to exhibit a forest language in CTL* that
is not in CTL, a language in PDL that is not in FO[≺], and a language that is not in graded
PDL. All of our examples have syntactic forest algebras with aperiodic vertical monoids, and
all the classes in question contain languages with arbitrarily complicated aperiodic vertical
monoids, so we really do need the machinery of forest algebras to give algebraic proofs of these
separations.

9.1. Forests with a maximal path in (ab)*. Consider the set Li of forests over A = {a, b} 
in which there is a maximal path — that is, a path from a root to a leaf — in (ah)*. This 
language is in CTL*. To see this, note that (p = E{A~^) is a forest formula in CTL* defining 
the set of nonempty forests. Consider the formally disjoint formulas 

(j)l = bA(f) 4>2 = b/\^(j)i 4>3 = ^4>i ^ ^(t>2- 

The formula $\varphi_1$ holds in non-leaf nodes with label $b$, the formula $\varphi_2$ holds in leaves with label
$b$, and the formula $\varphi_3$ holds in nodes with label $a$. Then $L_1$ is defined by the CTL* forest
formula $\mathsf{E}((\varphi_3\varphi_1)^*(\varphi_3\varphi_2))$. We claim that $L_1$ is not in CTL. To do this, by Theorem 8.2, we
need only exhibit a multicontext $p$ with respect to which the syntactic forest algebra of $L_1$
has vertical confusion. Let $p = a\Box + b\Box$. Let $h_0$ be the class of the tree $b$ in the syntactic
congruence of $L_1$, and let $h_1$ be the class of the tree $ab$. Observe that $h_0$ and $h_1$ are distinct
horizontal elements of the syntactic algebra, since $h_1$ contains elements of $L_1$ and $h_0$ does
not. We have vertical confusion, because $p[h_0]$ is then the class of $ab + bb$, which is $h_1$, and
$p[h_1]$ is the class of $aab + bab$, which is $h_0$.
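
The alternation between $h_0$ and $h_1$ can be checked mechanically on the two representatives $b$ and $ab$. The following Python sketch is only an illustration: the tree encoding, the helper names, and the use of the pair "has a maximal path in $(ab)^*$, has a maximal path in $b(ab)^*$" as a proxy for the syntactic class of a nonempty forest are assumptions of this sketch, motivated informally by the fact that a context only prepends a fixed word to every path of the inserted forest.

```python
import re
from typing import List, Tuple

# A tree is (label, children); a forest is a list of trees.
Tree = Tuple[str, list]
Forest = List[Tree]


def maximal_paths(forest: Forest, prefix: str = "") -> List[str]:
    """Return the label words of all root-to-leaf paths of the forest."""
    words = []
    for label, children in forest:
        word = prefix + label
        if children:
            words.extend(maximal_paths(children, word))
        else:
            words.append(word)
    return words


def has_path_in(forest: Forest, pattern: str) -> bool:
    """True if some maximal path spells a word matching the whole pattern."""
    return any(re.fullmatch(pattern, w) for w in maximal_paths(forest))


def in_L1(forest: Forest) -> bool:
    return has_path_in(forest, r"(ab)*")


def profile(forest: Forest) -> Tuple[bool, bool]:
    """Assumed proxy for the syntactic class of a nonempty forest."""
    return (has_path_in(forest, r"(ab)*"), has_path_in(forest, r"b(ab)*"))


def p(forest: Forest) -> Forest:
    """The multicontext a[] + b[]: the same forest is plugged into both holes."""
    return [("a", forest), ("b", forest)]


b: Forest = [("b", [])]
ab: Forest = [("a", [("b", [])])]

membership = [in_L1(b), in_L1(p(b)), in_L1(ab), in_L1(p(ab))]
```

Here `p(b)` is the forest $ab + bb$ and `p(ab)` is $aab + bab$, so applying `p` swaps the two profiles, matching the vertical confusion described above.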

9.2. Binary trees with even path length. This example uses unlabeled binary trees,
which are trees over a one-letter alphabet $\{a\}$ where every node has zero or two children.
Let $L_2$ be the set of unlabeled binary trees where every path from the root to a leaf has
even length. Let $p$ be the uniform multicontext $a(\Box + \Box)$. Let $h_0$ denote the set of binary
trees in which every maximal path has even length, and $h_1$ the set of binary trees in which
every maximal path has odd length. These are distinct classes in the syntactic congruence
of $L_2$. Clearly $p[h_0] = h_1$ and $p[h_1] = h_0$, so we have vertical confusion with respect to a
uniform multicontext, and thus by Theorem 8.2 $L_2$ is not in FO[$\prec$].
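
The parity flip caused by $p$ is easy to observe on small trees. The Python sketch below is illustrative only: the encoding of a leaf as `()` and of an inner node as a pair of subtrees, and the convention that path length counts nodes rather than edges, are assumptions of the sketch. The flip itself does not depend on that convention.

```python
def path_parities(tree) -> set:
    """Parities (length mod 2) of all root-to-leaf paths, counting nodes."""
    if tree == ():  # a leaf: one path consisting of a single node
        return {1}
    left, right = tree
    below = path_parities(left) | path_parities(right)
    # The root adds one node to every path of the subtrees.
    return {(1 + parity) % 2 for parity in below}


def p(tree):
    """The uniform multicontext a([] + []): a new root above two copies of the tree."""
    return (tree, tree)


leaf = ()
t1 = p(leaf)  # every maximal path has 2 nodes
t2 = p(t1)    # every maximal path has 3 nodes
parities = [path_parities(leaf), path_parities(t1), path_parities(t2)]
```

Each application of `p` moves a tree whose maximal paths all have one parity to a tree whose maximal paths all have the other parity.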

An argument due to Potthoff [19] can be used to show that $L_2$ is definable in first-order
logic in which there are both the ancestor and the next-sibling relations. Languages definable
in FO[$\prec$] are obviously in the intersection of the class of languages definable in FO with
$\prec$ and the next-sibling relation, and the class of languages $L$ with commutative $H_L$.
This example shows that the containment is strict. Note that $L_2$ is expressible in graded
PDL, so we have also established that the languages in graded PDL with aperiodic forest
algebras need not be definable in FO[$\prec$] (there is even an example, also due to Potthoff,
which shows that languages definable in graded PDL with aperiodic forest algebras need
not be definable in FO with $\prec$ and the next-sibling relation).

9.3. Boolean expressions. Consider the set $L_3$ of trees over the alphabet $\{0, 1, \vee, \wedge\}$ that
are well-formed boolean expressions (i.e., all the leaf nodes are labeled 0 or 1, and all the
interior nodes are labeled $\vee$ or $\wedge$) that evaluate to 1. $L_3$ is contained in a single equivalence
class of the syntactic congruence, as is the set of well-formed trees that evaluate to 0. We
denote the corresponding elements of $H_{L_3}$ by $h_1$ and $h_0$.

Now consider the multicontext $p = \vee(\wedge(\Box + \Box) + \wedge(\Box + \Box))$. We can fix a value 1 or
0 in any single hole, and then set the remaining holes to obtain either a tree evaluating to
1 or a tree evaluating to 0. Thus the syntactic algebra of $L_3$ has horizontal confusion with
respect to the multicontext $p$ and the set $\{h_0, h_1\}$, and so $L_3$ is not in graded PDL. Observe
that the vertical component of the syntactic algebra of $L_3$ is aperiodic: in contrast to the
word case, languages recognized by aperiodic algebras are not necessarily expressible in
first-order logic, or even in graded PDL.
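
The claim about fixing a single hole can be verified by brute force over the sixteen ways of filling the four holes. The Python sketch below is a minimal illustration. It fills the holes with the leaves 0 and 1 as representatives of $h_0$ and $h_1$, and the tree encoding and function names are assumptions of the sketch.

```python
from itertools import product


def evaluate(tree) -> int:
    """Evaluate a boolean expression tree given as 0, 1, or (op, [children])."""
    if tree in (0, 1):
        return tree
    op, children = tree
    values = [evaluate(child) for child in children]
    return int(any(values)) if op == "or" else int(all(values))


def p(x1: int, x2: int, x3: int, x4: int):
    """The multicontext or(and([] + []) + and([] + [])) with its four holes filled."""
    return ("or", [("and", [x1, x2]), ("and", [x3, x4])])


def outcomes_with_hole_fixed(hole: int, value: int) -> set:
    """Values p can take when one hole is fixed and the other three range freely."""
    results = set()
    for rest in product((0, 1), repeat=3):
        filling = list(rest)
        filling.insert(hole, value)
        results.add(evaluate(p(*filling)))
    return results


# For every hole and every fixed value, both outcomes 0 and 1 remain reachable.
confused = all(
    outcomes_with_hole_fixed(hole, value) == {0, 1}
    for hole in range(4)
    for value in (0, 1)
)
```

No single hole determines the value of the expression, which is the behaviour that horizontal confusion captures.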

9.4. Horizontally idempotent and commutative algebras. Obviously, we can separate
CTL* and PDL from FO[$\prec$] and graded PDL, respectively, because the syntactic
algebras for the former classes have idempotent and commutative horizontal parts, while
for the latter the horizontal components need only be aperiodic and commutative. Thus,
for example, any language in FO[$\prec$] that fails to satisfy the idempotency condition is not
in CTL*. We can use our algebraic methods to show that this is in fact the only distinction:

Theorem 9.1. Let $(H, V)$, $(H_j, V_j)$, $j = 1, \ldots, k$ be forest algebras such that $H$ is idempotent
and commutative, each $(H_i, V_i)$ is a path algebra, and such that $(H, V)$ divides
$(H_1, V_1) \circ \cdots \circ (H_k, V_k)$. Then each $(H_i, V_i)$ has a distributive homomorphic image $(H_i', V_i')$
such that $(H, V)$ divides $(H_1', V_1') \circ \cdots \circ (H_k', V_k')$.

Proof. Let $(H, V)$ be a path algebra. We define $e(H)$ to be the set of idempotents of
$H$. By the commutativity of $H$, the sum of two idempotents is idempotent. Thus $e(H)$
is an idempotent and commutative submonoid of $H$. If $h \in H$ and $k$ is a nonnegative
integer, we denote by $k \cdot h$ the sum of $k$ copies of $h$. We also denote by $\omega \cdot h$ the unique
idempotent in $\{k \cdot h : k \in \mathbb{N}\}$. Since $H$ is aperiodic and commutative, there exists $k$ such
that $\omega \cdot h = k \cdot h = (k + 1) \cdot h$ for all $h \in H$.
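
To make the operation $\omega \cdot h$ concrete, the following Python sketch computes it in a small aperiodic commutative monoid. The example monoid (the numbers $0, \ldots, n$ under addition capped at $n$) and the function names are assumptions chosen for illustration; they do not come from the text.

```python
def make_threshold_monoid(n: int):
    """Commutative aperiodic monoid {0, ..., n} under addition capped at n."""
    elements = list(range(n + 1))

    def add(x: int, y: int) -> int:
        return min(x + y, n)

    return elements, add


def omega(h: int, add) -> int:
    """Return omega*h by iterating k*h until k*h == (k+1)*h."""
    current = h
    while add(current, h) != current:
        current = add(current, h)
    return current


elements, add = make_threshold_monoid(3)
omegas = {h: omega(h, add) for h in elements}
idempotents = [h for h in elements if add(h, h) == h]
```

In this example $e(H) = \{0, 3\}$, and $\omega$ sends 0 to 0 and every other element to 3. The loop terminates in any aperiodic monoid because the sequence $k \cdot h$ eventually stabilizes.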

For every $v \in V$, we define a function $\bar{v} : e(H) \to e(H)$ by

$\bar{v} \cdot e = \omega(ve).$

We define a forest algebra $(e(H), \bar{V})$ as follows. The horizontal monoid is $e(H)$. The
vertical monoid is $\bar{V} = \{\bar{v} : v \in V\}$, with function composition. The action is by applying
the function $\bar{v}$ to an argument $e \in e(H)$. To prove that this is a forest algebra, we need to
show that for any element $e \in e(H)$, there is an element $\bar{v} \in \bar{V}$ such that $\bar{v}f = e + f$ holds
for any $f \in e(H)$. This element is simply $\overline{e + \Box}$. Indeed,

$\overline{(e + \Box)} \cdot f = \omega((e + \Box)f) = \omega(e + f) = e + f.$






This concludes the proof that $(e(H), \bar{V})$ is a forest algebra.

We now show that $(e(H), \bar{V})$ is distributive. In other words, the following identity holds
for any $v \in V$ and $e_1, e_2 \in e(H)$:

$\bar{v}(e_1 + e_2) = \bar{v}e_1 + \bar{v}e_2 \qquad (9.1)$

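Using the first identity in the definition of path algebras (5.1), we obtain

\begin{align*}
\bar{v}(e_1 + e_2) &= \omega\, v(e_1 + e_2) \\
&= \omega(v(e_1 + e_2) + v(e_1 + e_2)) \\
&= \omega(v(e_1 + e_2 + e_1 + e_2) + v \cdot 0) \\
&= \omega(v(e_1 + e_2) + v \cdot 0) \\
&= \omega(ve_1 + ve_2) \\
&= \omega\, ve_1 + \omega \cdot ve_2 \\
&= \bar{v}e_1 + \bar{v}e_2 .
\end{align*}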

Finally, we prove that the function

$\alpha(h) = \omega \cdot h \qquad \alpha(v) = \bar{v}$

is a forest algebra homomorphism

$\alpha : (H, V) \to (e(H), \bar{V}).$

Clearly $\alpha$ preserves $+$. It remains to show that it preserves the remaining two operations
of forest algebra, namely inserting a forest into a context and composition of two contexts.
For inserting a forest into a context, we have

$\alpha(vh) = \omega \cdot vh = \bar{v}h = \alpha(v)\alpha(h).$

For composition of two contexts we need to show $\alpha(v_1)\alpha(v_2) = \alpha(v_1 v_2)$. Since $\bar{V}$ is defined
as a set of functions on $e(H)$, we need to show that both sides of the equality describe the
same function on $e(H)$. In other words, we have to prove that for every $e \in e(H)$,

$\bar{v}_1(\bar{v}_2 e) = \overline{v_1 v_2}\, e \qquad (9.2)$

First note that the path algebra property (5.1) implies that for all $h \in H$, $v \in V$,
$m \in \{1, 2, \ldots\}$, we have

$m \cdot (vh) = v(m \cdot h) + (m - 1) \cdot (v0).$

Thus by aperiodicity of $H$,

$\omega(vh) = v(\omega h) + \omega(v0).$

If $e \in H$ is idempotent, this becomes

$\omega(ve) = ve + \omega(v0).$

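Consequently we have

\begin{align*}
\overline{v_1 v_2}\, e &= \omega(v_1 v_2 e) \\
&= v_1 \omega(v_2 e) + \omega(v_1 \cdot 0) \\
&= v_1(v_2 e + \omega(v_2 \cdot 0)) + \omega(v_1 \cdot 0) \\
&= v_1(v_2 e + \omega(v_2 \cdot 0) + \omega(v_2 \cdot 0)) + \omega(v_1 \cdot 0) \\
&= v_1(\omega(v_2 e) + \omega(v_2 \cdot 0)) + \omega(v_1 \cdot 0) \\
&= v_1 \omega(v_2 e + \omega(v_2 \cdot 0)) + \omega(v_1 \cdot 0) \\
&= \omega\, v_1(v_2 e + \omega(v_2 \cdot 0)) \\
&= \omega\, v_1(\omega\, v_2 e) \\
&= \bar{v}_1(\bar{v}_2 e).
\end{align*}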

Summing up: we have defined a forest algebra homomorphism

$\alpha : (H, V) \to (e(H), \bar{V})$

where the target forest algebra is distributive and horizontally commutative and idempotent.

Suppose now that $(H, V)$ is a forest algebra with idempotent and commutative $H$ that
divides a wreath product

$(H_1, V_1) \circ \cdots \circ (H_k, V_k),$

where each $(H_i, V_i)$ is a path algebra. To complete the proof of the theorem, we will show
that $(H, V)$ divides

$(e(H_1), \bar{V}_1) \circ \cdots \circ (e(H_k), \bar{V}_k).$

We now apply Lemma 4.1 on the equivalence of the two definitions of division. The
hypothesis is then that there is a submonoid $H'$ of $H_1 \times \cdots \times H_k$, and a homomorphism
$f$ from $H'$ onto $H$ with the following property: for each $v \in V$ there is $\hat{v}$ in the vertical
monoid of $(H_1, V_1) \circ \cdots \circ (H_k, V_k)$ such that for all $h \in H'$,

$f(\hat{v}h) = v f(h).$

Note that $h$ has the form $(h_1, \ldots, h_k)$, where $h_i \in H_i$ for $i = 1, \ldots, k$, and

$\hat{v} = (u, g_2, \ldots, g_k),$

where $u \in V_1$ and each

$g_j : H_1 \times \cdots \times H_{j-1} \to V_j$

is a map. Since $H$ is idempotent, $f(\omega h) = f(h)$ for all $h \in H'$. We consider the restriction
of $f$ to $e(H')$, which is a subset of $e(H_1) \times \cdots \times e(H_k)$. We will show that for each $v$
there is an element $\tilde{v}$ of the vertical monoid of $(e(H_1), \bar{V}_1) \circ \cdots \circ (e(H_k), \bar{V}_k)$ such that for
all $e \in e(H')$,

$f(\tilde{v}e) = v f(e).$

To do this, we simply alter $\hat{v} = (u, g_2, \ldots, g_k)$ in the obvious fashion:

$\tilde{v} = (\bar{u}, \bar{g}_2, \ldots, \bar{g}_k),$

where by definition

$\bar{g}_j(x) = \overline{g_j(x)}.$

Set $e = (e_1, \ldots, e_k) \in e(H')$. We have

\begin{align*}
f(\tilde{v}e) &= f(\bar{u}e_1, \bar{g}_2(e_1)e_2, \ldots, \bar{g}_k(e_1, \ldots, e_{k-1})e_k) \\
&= f(\bar{u}e_1, \overline{g_2(e_1)}e_2, \ldots, \overline{g_k(e_1, \ldots, e_{k-1})}e_k) \\
&= f(\omega(ue_1), \omega\, g_2(e_1)e_2, \ldots, \omega\, g_k(e_1, \ldots, e_{k-1})e_k) \\
&= f(ue_1, g_2(e_1)e_2, \ldots, g_k(e_1, \ldots, e_{k-1})e_k) \\
&= f(\hat{v}e) \\
&= v f(e),
\end{align*}

which completes the proof.

□ 

Theorem 5.2 immediately yields the following corollary:

Theorem 9.2. A forest language is definable in CTL* (respectively PDL) if and only if
it is definable in FO[$\prec$] (respectively graded PDL) and its syntactic algebra is horizontally
idempotent.

The first of these facts follows from a result of Moller and Rabinovich [16], who show
that over infinite trees properties expressible in CTL* are exactly the bisimulation-invariant
properties expressible in monadic path logic.

10. Conclusion and further research 

Results like those in Section 9 are typically proved by model-theoretic methods. Here we
have demonstrated a fruitful and fundamentally new way, based on algebra, to study the
expressive power of these logics.

Of course, the big question left unanswered is whether we can establish effective necessary
and sufficient conditions for membership in any of these classes. We do not expect that
the conditions established in Theorem 8.2 are sufficient. The approach outlined in Section 6
may constitute a model for how to proceed: a deeper understanding of the ideal structure
of forest algebras can lead to new wreath product decomposition theorems.

In a sense, we are searching for the right generalization of aperiodicity. For regular languages
of words, aperiodicity of the syntactic monoid, expressibility in first-order logic with
linear ordering, expressibility in linear temporal logic, and recognizability by an iterated
wreath product of copies of the aperiodic unit $U_2$ are all equivalent. For forest algebras, the
obvious analogues are, respectively, aperiodicity of the vertical component of the syntactic
algebra, expressibility in FO[$\prec$], expressibility in CTL, and recognizability by an iterated
wreath product of copies of $\mathcal{U}_2$. As we have seen, only the last two coincide. Understanding
the precise relationship among these different formulations of aperiodicity for forest algebras
is an important goal of this research.
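
For the word case, aperiodicity of a finite monoid is a simple finite check. The Python sketch below performs it for $U_2$, assuming the standard presentation of $U_2$ as the three-element monoid consisting of an identity and two right-zero elements. This presentation and all names in the code are assumptions of the sketch, since the monoid is not spelled out in this section.

```python
from itertools import product

# Assumed standard presentation of U2: identity "1" plus two right-zero elements.
ELEMENTS = ("1", "a", "b")


def mul(x: str, y: str) -> str:
    """Product in U2: the identity is neutral, otherwise the right factor wins."""
    return x if y == "1" else y


def is_associative() -> bool:
    return all(
        mul(mul(x, y), z) == mul(x, mul(y, z))
        for x, y, z in product(ELEMENTS, repeat=3)
    )


def is_aperiodic() -> bool:
    """Check that every element x satisfies x^n == x^(n+1) for some n <= |M|."""
    for x in ELEMENTS:
        power = x
        stabilizes = False
        for _ in range(len(ELEMENTS)):
            nxt = mul(power, x)
            if nxt == power:
                stabilizes = True
                break
            power = nxt
        if not stabilizes:
            return False
    return True
```

The same loop applies to the vertical monoid of a finite forest algebra once its multiplication table is available.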

Another way of looking at this research is that it sets the scene for a Krohn-Rhodes
theorem for trees. The Krohn-Rhodes theorem states that every transition monoid divides
an iterated wreath product of transition monoids which are either $U_2$ or groups that divide
the original monoid. The ingredients of the theorem are therefore: a notion of wreath
product, a notion of an easy transition monoid $U_2$, and a notion of a difficult transition
monoid (a group). For our purposes here, we are particularly interested in the (already
quite difficult) version of the theorem which states that every aperiodic transition monoid
divides a wreath product of copies of $U_2$. In this paper, we have provided some of the
ingredients: the wreath product and the easy objects. (There are several candidates for the
easy objects, e.g. simply $U_2$ or maybe path algebras. There are probably several Krohn-Rhodes
theorems.) We have provided examples of properties one expects from the difficult
objects (the various types of confusion), but we still have no clear idea what they are (in
other words, what is a tree group?). We have also shown that the wreath product is strongly
related to logics and composition. Finding (at least one) Krohn-Rhodes theorem for trees
is probably the most ambitious goal of this research.

References 

[1] Pablo Barceló and Leonid Libkin. Temporal logics over unranked trees. In LICS, pages 31-40. IEEE
Computer Society, 2005.

[2] M. Benedikt and L. Segoufin. Regular tree languages definable in FO. In Volker Diekert and Bruno
Durand, editors, STACS, volume 3404 of Lecture Notes in Computer Science, pages 327-339. Springer,
2005.

[3] M. Bojańczyk and I. Walukiewicz. Forest algebras. In Jörg Flum, Erich Grädel, and Thomas Wilke,
editors, Logic and Automata: History and Perspectives. Amsterdam University Press, 2008.

[4] Mikołaj Bojańczyk. Two-way unary temporal logic over trees. In LICS, pages 121-130, 2007.

[5] Mikołaj Bojańczyk and Luc Segoufin. Tree languages defined in first-order logic with one quantifier
alternation. In ICALP, pages 233-245, 2008.

[6] Mikołaj Bojańczyk, Luc Segoufin, and Howard Straubing. Piecewise testable tree languages. In LICS,
pages 442-451. IEEE Computer Society, 2008.

[7] Janusz A. Brzozowski and Robert Knast. The dot-depth hierarchy of star-free languages is infinite. J.
Comput. Syst. Sci., 16(1):37-55, 1978.

[8] Joëlle Cohen, Dominique Perrin, and Jean-Éric Pin. On the expressive power of temporal logic. J.
Comput. Syst. Sci., 46(3):271-294, 1993.

[9] Z. Ésik and P. Weil. On logically defined recognizable tree languages. In Paritosh K. Pandya and
Jaikumar Radhakrishnan, editors, FST TCS 2003: Foundations of Software Technology and Theoretical
Computer Science, 23rd Conference, Mumbai, India, December 15-17, 2003, Proceedings, volume 2914
of Lecture Notes in Computer Science, pages 195-207. Springer, 2003.

[10] Z. Ésik and P. Weil. Algebraic recognizability of regular tree languages. Theor. Comput. Sci.,
340(1):291-321, 2005.

[11] Zoltán Ésik. Characterizing CTL-like logics on finite trees. Theor. Comput. Sci., 356(1-2):136-152, 2006.

[12] Zoltán Ésik and Szabolcs Iván. Aperiodicity in tree automata. In Symeon Bozapalidis and George
Rahonis, editors, CAI, volume 4728 of Lecture Notes in Computer Science, pages 189-207. Springer,
2007.

[13] Zoltán Ésik and Szabolcs Iván. Some varieties of finite tree automata related to restricted temporal
logics. Fundamenta Informaticae, 82:79-103, 2008.

[14] T. Hafer and W. Thomas. Computation tree logic CTL* and path quantifiers in the monadic theory of
the binary tree. In International Colloquium on Automata, Languages and Programming, volume 267
of Lecture Notes in Computer Science, pages 260-279, 1987.

[15] L. Libkin. Logics for unranked trees: an overview. In Automata, Languages and Programming, volume
3580 of Lecture Notes in Comput. Sci., pages 35-50. Springer, Berlin, 2005.

[16] Faron Moller and Alexander Moshe Rabinovich. On the expressive power of CTL*. In LICS, pages
360-369, 1999.

[17] Faron Moller and Alexander Moshe Rabinovich. Counting on CTL*: on the expressive power of monadic
path logic. Inf. Comput., 184(1):147-159, 2003.

[18] T. Place and L. Segoufin. A decidable characterization of locally testable tree languages. In International
Colloquium on Automata, Languages and Programming, pages 285-296, 2009.

[19] A. Potthoff. First-order logic on finite trees. Lecture Notes in Computer Science, 915:125-139, 1995.

[20] R. McNaughton and S. Papert. Counter-free Automata. MIT Press, Cambridge, USA, 1971.

[21] P. Stiffler. Extensions of the fundamental theory of finite semigroups. Advances in Mathematics,
11:159-209, 1973.






[22] H. Straubing. A generalization of the Schützenberger product of finite monoids. Theor. Comput. Sci.,
13:137-150, 1981.

[23] H. Straubing. Finite Automata, Formal Logic, and Circuit Complexity. Progress in Theoretical
Computer Science. Birkhäuser Boston Inc., Boston, MA, 1994.

[24] Thomas Wilke. Classifying discrete temporal properties. In Christoph Meinel and Sophie Tison, editors,
STACS, volume 1563 of Lecture Notes in Computer Science, pages 32-46. Springer, 1999.

[25] Zhilin Wu. A note on the characterization of TL[EF]. Information Processing Letters, 102(2-3):28-54,
2007.



This work is licensed under the Creative Commons Attribution-NoDerivs License. To view
a copy of this license, visit http://creativecommons.org/licenses/by-nd/2.0/ or send a
letter to Creative Commons, 171 Second St, Suite 300, San Francisco, CA 94105, USA, or
Eisenacher Strasse 2, 10777 Berlin, Germany



